(Hopefully-Reusable) Life Lessons for PhD Students in NLP

Elvis Saravia · Beginner ·📐 ML Fundamentals ·5y ago

Key Takeaways

The video discusses life lessons for PhD students in NLP, covering topics such as common sense reasoning, language models, and research productivity, with a focus on practical advice for navigating a PhD program and building a successful career in NLP. The speaker, Vered Shwartz, shares her experiences and insights on how to choose quality research problems, manage time effectively, and maintain a healthy work-life balance.

Full Transcript

I'm Fatma, a PhD student at the University of West Scotland. I work on NLP and social media, and I'm so happy to introduce Vered today. Vered Shwartz, is that how it's pronounced?

Yeah, exactly.

Good. She's a postdoc at the Allen Institute for AI, and she does very cool research, so I will let her take the mic from here and tell us about her research. Over to you.

Thanks, Fatma. Let me share my screen. Let me know if you see the slides without the notes.

Yeah, we can see only the slides here.

Great. Thanks, everyone, for joining. Most of this talk is actually not going to be about research. It's more of a meta talk about life lessons that I learned during my PhD, some of which will hopefully be reusable for others.

I've already been introduced, but let me start by telling you a little bit about my background. I'm currently a postdoc at the Allen Institute for AI (AI2) and the University of Washington. Before that, I did my PhD at Bar-Ilan University in Israel, and in a few months I'm going to be an assistant professor at the University of British Columbia.

Let me start with a brief overview of my work, which is on commonsense reasoning in NLP. In the last few years, NLP has been almost synonymous with language models. I'm going to use BERT here as just one example of a language model, but in general, as you probably all know if you're doing NLP, these models are pre-trained to read a large text corpus such as Wikipedia or the web, and as a result they learn about syntax, word meanings, factual knowledge, and more. The main paradigm in NLP is then to fine-tune these models on specific downstream tasks. For example, in sentiment analysis you can encode a text such as "the chocolate cake is amazing" and then classify it as positive or negative. And the fine-tuning step is
responsible both for understanding the task and for learning to solve it. But there is a generalization issue. A line of work has shown that supervised models tend to learn spurious correlations in specific datasets rather than learning to solve the underlying task.

A few notable examples. First, from computer vision: this image captioning model has been shown to have trouble recognizing objects when they're placed outside their typical environment. For example, it predicts "a horse standing in the grass" instead of "an alligator standing in the grass", likely because most of the images with grass as the environment had horses as the animal. Similarly, in visual question answering, this model learned to associate the type of question, such as "how many" questions, with the most common answer for it in the training set, for example answering all of these questions with the number two.

Finally, and more related to NLP, a few years ago several papers showed that natural language inference models tend to learn spurious correlations. For example, this state-of-the-art model predicts that "I only had a soup but it was very filling" contradicts "I didn't eat a salad", although it should entail it. I'll give you a second to think about it. The reason this happens is likely that most of the negation words appeared in contradicting hypotheses.

So the bottom line is that we have models that are very good at solving the datasets they were trained on, but they don't necessarily solve the underlying tasks, and they're often right for the wrong reasons. The thing is that as long as we're using machine learning to build NLP models, they're always going to be trained on only a sample of the situations that
they may encounter in practice. So in order to handle unknown situations reasonably, they need some commonsense knowledge and reasoning abilities.

A few examples. A translation system sometimes needs to infer meaning that is implicit in the source language and translate it explicitly into the target language. Here you can see the example of "grass-fed yogurt", which should be translated to the equivalent of "yogurt made of milk from grass-fed cows", but instead Google Translate translated it into Hebrew as the equivalent of "yogurt with grass". In reading comprehension, if you see a headline such as "Stevie Wonder announces he'll be having kidney surgery during London concert", there could be two interpretations, but using our commonsense knowledge and reasoning abilities we can eliminate the interpretation in which Stevie Wonder performs during his own surgery. And perhaps more importantly, chatbots built for giving medical advice need to have some sense of social norms and ethics, to know better than to advise a patient to kill themselves. Luckily this happened not in production but with a fake patient, but it is still pretty concerning.

It is kind of difficult to define common sense, but I'm using the working definition we had in the ACL tutorial we gave last year: it is the basic level of practical knowledge and reasoning concerning everyday situations and events that is commonly shared among most people. That includes a range of different types of common sense: for example, physical common sense, such as knowing that it's a bad idea to put your hand on top of a hot stove; social common sense, such as knowing that it's impolite to comment on people's weight; temporal common sense, about the typical times and durations of events such as dinner; and more.

I'm not going to go into the specific details of my work, but I'm just going to mention a few lines of work that I've
been working on in this area of language understanding and commonsense reasoning.

One of my earlier works, during my PhD, was on interpreting implicit meaning in noun compounds. For example, we know that olive oil is oil made of olives, but baby oil is not oil made of babies, but rather oil used for babies. Although this sounds trivial, almost silly, it's not trivial for machines, because it is implicit: it's something that we just interpret based on our knowledge of the world.

In a more recent paper, from last year, I worked on introspective knowledge discovery. Let's use this example: "Children need to eat more vegetables because they're healthy." Syntactically, the word "they" here can refer either to the children or to the vegetables, and to determine which of these it refers to, we need to reason about some implicit knowledge in this context. The way I proposed to address it is by generating clarification or follow-up questions, such as "What are the properties of vegetables?", and then answering them with something like "Vegetables are full of vitamins and they can make you healthy." This makes that knowledge explicit, which may then help determine the correct answer.

A third line of work is on reasoning. Specifically, I worked on non-monotonic reasoning, which is a mode of reasoning in which you can reach some conclusion and then, given additional information, invalidate or weaken that conclusion. I worked on several kinds, such as abductive reasoning, counterfactual reasoning, and defeasible reasoning. Abductive reasoning is coming up with the most plausible explanation of incomplete observations. For example, given the past observation that Sarah wanted to make dinner for some guests and the future observation that she had to order pizza for her friends instead, we can reason about what might have happened in
between. It could be something like she just realized she doesn't know how to cook, or it could be something else, like maybe she cooked and found out at the last minute that the food wasn't good, so she had to fall back on the backup plan of ordering pizza instead.

In defeasible reasoning, I'm going to use the famous example. If we know that Tweety is a bird, then we can hypothesize that it's likely that Tweety flies. But if we get additional information saying that Tweety is actually a penguin, then Tweety is still a bird, but it's less likely, or unlikely, that Tweety flies. In our work, also last year, we extended this and said we can also look at a sentence that strengthens the inference. For example, if Tweety is on a tree, then it's more likely that Tweety flies, because otherwise how would Tweety get onto the tree?

So these are pretty much the lines of work that I've been working on more recently. There are multiple challenges here, but one of them is that if we want to teach machines commonsense reasoning, then we also need some kind of resource with machine-readable commonsense knowledge. There are some existing resources, but they don't cover all the commonsense knowledge in the world, and acquiring it is pretty challenging. In the past, it was common to either collect it from people or extract it from text. Collecting it from people is not really scalable, because it's impossible to manually enumerate all the commonsense knowledge in the world, and it's also very costly and time-consuming to try. Extracting it from text works, but it suffers from reporting bias: people tend to speak more about the exceptional than about the obvious things that everybody knows, and this is reflected in the text corpus. For example, if you look at corpus frequencies,
you might learn the absurd fact that people murder more than they breathe, just because it's more newsworthy.

Recently there's also a third approach that is becoming popular, which is to extract such knowledge from pre-trained language models. This has mixed results. On the one hand, these models seem to capture a lot of knowledge, including facts that are not explicitly mentioned in the corpus, so they can aggregate across different contexts, and this is great. But at the same time, they're not really sensitive to negation, so they also assign high probabilities to negated facts such as "birds cannot fly". They don't really differentiate constant facts, such as zebras being black and white, from contingent facts, such as the color of my shirt. I notice I always give this slide when I'm wearing either a black or a white shirt. I should probably change it.

Also, last year in our EMNLP paper, we showed that these models tend to memorize facts pertaining to named entities and then use them even when they're uncalled for. For example, if you give GPT-2 the prefix "Richard has a bad", it generates something like "habit of saying things that are not true", and this is okay. If you look at the attention weights, you can see that most of the attention goes to the word "bad", which is pretty reasonable for this very short prefix. But if you replace "Richard" with "Donald", you end up getting something like "reputation for being a racist", and looking at the attention weights again, you can see that now the attention is equally divided between "Donald" and "bad". This is undesirable behavior for most applications, because nobody actually said that it's Donald Trump. The model is assuming things that it shouldn't be assuming.

We also showed that language models don't completely overcome reporting bias, so
it is true that they're capable of assigning a non-zero probability to very trivial facts that might not be mentioned often in text, but at the same time they also amplify the likelihood of very rare and sensational events. Here's an example. This is a prompt to GPT-2: "The man turned on the faucet. As a result," and then GPT-2 says, "the man's blood was sprayed everywhere." This is not uncommon. It happens every few times you generate text.

So in the future, rather than learning commonsense knowledge only from text, it would be beneficial to also learn it from additional modalities such as images and videos. They often provide complementary knowledge that is unlikely to be discussed in text. For example, looking at enough class photos, you might learn that the kids in the first row are typically sitting but those in the last row are typically standing. And you need to be cautious about the other types of reporting bias you might pick up from these modalities. For example, learning from movies, you might conclude that people typically hang up the phone without saying goodbye.

Okay, so that's it about my research. Before I move on to the main part of the talk, does anyone want to ask questions?

Yeah, I actually have a question. I'm not sure if you said that and I missed that part, but do you think reasoning can be used to enhance some NLP problems, like rumor detection or emotion detection or something like that?

Definitely. Thanks for asking that. I had it in a different slide deck; I don't have it here. I think that counterfactual reasoning specifically can help with that. I'm not sure about rumor detection, but I did think about how it could be used for fake news or misinformation detection. If you have some fact and you want to know whether it's true or not, let's say you have some claim X, and you can show that if X had been true, then that
would mean that Y is also true, and you know that Y isn't true, then that's a way to prove that it's a false claim. There are other applications in which these non-monotonic reasoning abilities could be used, like real-time summarization of an unfolding event, where you can update your beliefs when you get new information. I had some other examples that I don't remember right now, but yeah, I think it's really useful. People worked on that in classical AI, but it was almost completely forgotten in modern NLP, and I do think we need to go back to looking at these problems.

Yeah, it actually sounds very interesting, and I think with transfer learning or multi-task learning we can enhance the current problems in NLP with reasoning. Thanks for answering the question.

Thanks.

Yeah, I will leave the space for anyone else's question.

Okay, if there are no questions about this part, then I'm going to move to the main part of the talk, which is about life lessons for PhD students in NLP. I have to start by saying that this is about my personal experience, and obviously not everything is going to be useful for everyone, so take whatever you want from it.

My first tip would be to be yourself: not to work on what everyone else is working on. That's mainly for two reasons. One is that it's just less interesting to work on what everybody else is doing. The other is that if you choose to go down this route, you're going to live under constant fear of being scooped, because if everybody is working on the same task, then it's likely that someone will publish before you do. I'm not saying you should work on a completely niche topic that nobody cares about, but maybe try to stay away from the low-hanging fruit or
things that you know other people are working on. These are just screenshots, one from Marek Rei's blog and the other from the ACL blog from last year, about the more popular topics right now. So again, I'm not saying stay away completely from the popular tracks at ACL, but try not to do what everyone else is currently doing.

This is another example of the lack of originality that the community sometimes shows. If you search Google Scholar for paper titles with "something is all you need", you're going to find dozens of them, or maybe even more, from after the "Attention Is All You Need" paper in 2017. I actually sometimes like these follow-up papers with similar titles, but when there are so many papers, it gets tiring.

So the questions to ask yourself are what your unique contribution could be and, more generally, what kinds of problems interest you. I think we're lucky in our field to be able to borrow problems from the real world, things that we actually need in our everyday life. So try to think about what kind of problem would be interesting to work on, and maybe stay away from what's popular.

Then also choose your problems carefully. Conference deadlines are there for better and for worse, and for worse, they encourage us to work on smaller goals. You should be pretty cautious here. On the one hand, you don't want to work on too many low-hanging fruit and do all this incremental work. And also, don't submit half-baked work. I'm saying this not from the point of view of an author of a paper but from the point of view of a reviewer. Maybe some of you are also reviewers, and you know that it's pretty annoying to review a paper that is clearly not ready and that the authors submitted just because they didn't
finish it before the deadline. There are always more deadlines, so try to finish your work and submit finished work. But looking at the other extreme, and I know some students need to hear this as well, don't wait until you solve NLP to publish. Work on a long-term problem, but break it into smaller problems for which you can publish intermediate results. It takes some time to understand what amount of work is publishable.

This is somewhat of a cliche, but choose quality over quantity. The graph you see here is again from Marek Rei's blog. He's been publishing statistics about NLP papers every year, and one of these statistics is who published the most first-authored papers that year. I'm not saying there's always a trade-off between quality and quantity. In fact, some of these very productive authors are people that I know, and I really respect their work; I think they do very high-quality work. But for most of us, it's pretty difficult to optimize both quality and quantity. I think that after a certain threshold in the number of papers you have, it doesn't really matter if you have more papers; it matters more that these papers are of high quality. I think that's also true when you're looking for a job. Having very few papers is not good, but once you have enough, and I don't know exactly what that number is, it's better to have very good papers than to have many mediocre papers.

Don't reinvent the wheel. This is something I keep saying to students. I came across this saying last year, "reading rots the mind". I think I heard it in a talk at AAAI last year. I strongly disagree with this saying, and I think that reading actually saves you time and heartbreak. Let me explain. Most of us are not creative geniuses, and
if you work on an idea and you don't know the literature, then you're more likely to reinvent ideas that have already been published, which is the best case, or, in the worst case, ideas that have failed in the past or that are impossible. So I do think it's a good idea to do a thorough literature review pretty early on in the project, not waiting for the paper writing in the few days before the deadline, and to try to figure out what the gaps in the existing work are. Sometimes I get new ideas from reading the literature, either from seeing something that people mention as a limitation of the current models, or from reading an older paper that suggested some idea that technically didn't work, where I think that today we have better tools and maybe it's worth repeating that experiment.

At the same time, looking at the other extreme, I don't think it's a good idea to read every paper that comes up on arXiv, because you're just going to be reading papers all the time instead of doing any other kind of research work. So I think you should stay focused and read more narrowly in your area than widely. Again, with exceptions: when I'm working on a specific topic, I sometimes like to look for papers in other fields that talk about something similar, to get some ideas on how it is treated in other fields. But in general, try to stay focused on your area.

This one is a difficult one, I know, for many students. It was for me as a new student. Don't overwork. Work is important, but your personal life is also important. I think it's important to set work time boundaries and to exceed them only on special occasions like deadlines. Also, if you finish your paper a few days before the deadline, you don't have to pull an all-nighter just because it's
the deadline night. Again, there are going to be exceptions, but the rule should generally be not to exceed your work time.

Something that helps, if you have a work email, is disabling notifications on your phone. I think that really helps. I didn't really have that as a PhD student, because I used my personal email. The university email was a typo of my maiden name that they wouldn't change, so it was a strange email, and I just used my personal one. So I kept getting email notifications outside of working hours, and it took me quite a long time to train myself not to reply immediately and to be able to wait until the next work day. Now I do have the work email on my phone, but I don't have notifications for it, so I choose when to go into the inbox and check whether I have emails.

Do things other than work, whatever is good for you. I find that exercising and spending time with family and friends are important, and of course, needless to say, sleeping and eating well. It's also okay to have hobbies. You're not expected to give up all your hobbies just because you're a PhD student. It's even okay to watch TV and just lie down on the couch and do nothing sometimes. Sometimes you need that. That's fine; don't feel guilty about it. And take time off, of course.

There was recently a discussion on Twitter in which some people at OpenAI talked about work ethics and boasted about working 90 hours per week. I want to say I was like this, probably not 90 hours, because that seems crazy, but I also worked evenings and weekends. That was in the first two years of my PhD program. Then I did an internship at Google, and it just happened that I didn't have remote access, so when I left the office, I couldn't work. I could have asked for it, but I noticed it was really nice to just have the
evenings off and the weekends off, to travel and do stuff, and I wasn't less productive. Once I learned that through this lack of ability to work outside working hours, I implemented it back at home, and it worked pretty well.

Stop saying "I didn't get any work done today." I have to remind myself of that too sometimes. Many things are work: email is work, meetings are work, and so are teaching, writing, programming, presenting your paper, and mentoring students. And yes, even looking for memes for your presentation is work.

Learn to say no. This is a hard one. When you're a new PhD student, you don't have a lot of opportunities to say no to, so it's good to say yes to many things. But once you become more senior, you're going to have a lot of opportunities and limited time, so you need to learn to say no. When talking about collaborations, the things you want to ask yourself are: Do I want to work on this project? Does it fit with my research goals? Do I have time, or am I already committed to too many other projects? Of course, it's a different question if you're trying to figure out whether to join an external collaboration or if your advisor asks you to mentor a student. In that case you don't want to say no, because I think it's also good for you.

In terms of service, I think it is important to give back to the community. If you're submitting papers, you also need to review papers, and depending on your career goals, I think it is also good for you to organize workshops or tutorials in your area. But once you have too many invitations, try to prioritize the top conferences and maybe also workshops in your specific area, and don't feel bad about declining if you're overcommitted. Think about it this way: if you say yes to all the invitations to review, you won't be able to be as good a reviewer as you would if you agreed to review
for fewer venues. And if it's difficult to say no, you can google helpful phrases for saying no in a polite way. This was pretty helpful for me in the beginning. Now I just copy from my previous emails declining invitations.

Learn to fail. This is something I've also noticed with many students: they tend to fall in love with their research projects, and sometimes things just don't work. So you need to learn to recognize when an idea has failed and move on. I'm not saying move on and completely forget about it. You can keep all the work you did on it stored in some folder, and maybe at some point you'll learn new things, or there will be new tools that can help you complete that project. So I'm not saying forget it completely, but try to move on to your next project.

Don't be defensive against criticism. I know this is a hard one. You probably get a lot of feedback that sometimes feels like criticism, whether it's comments from your advisor, reviews, or whatever that is. People naturally have the tendency to become defensive. What I did at some point was just try to be thankful and then take whatever I wanted from it. So even if you get comments from your advisor or something like that, you should be grateful for the time they took to give you feedback, but you also don't have to implement everything. It's eventually your work, so just take what you want from it.

And get used to failing and recovering fast, because academia is just full of rejections. It's something that you get used to after a while. I mean, it's still painful sometimes, but you get used to it. I see that I put a bullet here about a handful of my failures, so let's think of some examples. In the first year of my PhD, I submitted a paper to ACL, and it was rejected. Then I submitted it to EMNLP and
it was rejected again. But I actually thought it was not a bad paper, so I submitted it to *SEM, which is a smaller conference more focused on semantics, and then I got best paper. So that was a success story. On the other hand, the following year I submitted a paper to ACL, and it was rejected, and one of the reviewers gave us a score of one, which is really insulting. But then I thought about it and said, well, it wasn't one of my good papers, and I just submitted it because there was a deadline, and I wasn't ready. I never actually submitted that paper again. There's not really a moral to the story. I'm just saying that you're going to get rejected many times, and that's fine. The way to maybe address this is to just try a lot more. If you try enough, eventually you're going to succeed, and then your past rejections don't really matter.

Another, more recent example is that I recently applied for jobs. I can say it was successful, because I managed to get a job. But of course I can also look at it in a different light and say that most of the jobs I applied for didn't even interview me, so that's a rejection. But again, if you try enough and you eventually succeed, then all the other rejections don't really matter. I also recommend looking at the "How I Fail" series from Veronika Cheplygina. I hope I'm pronouncing her name correctly. She interviews many successful people about their failures along the way.

You should say "I don't know" often. There's no shame in that. I say it a lot. You're not expected to know everything, and if you don't know something, even something you think you should know, then you should ask questions, even if you think they're basic. I can say about myself that when I started my postdoc, I started working on a slightly different topic than the one I had worked on previously, and there were
many basic things that I didn't know. I was mostly focusing on NLP, and then I moved to an area that has a lot of overlap with broader AI, and my background in AI wasn't very strong. So I had very basic questions, and that could have been embarrassing for me, but then again, I learned from that, and that's what's important. And you're going to end up knowing a lot about something. Wait, what does it say? "Learn more and more about less and less until you know everything about nothing."

Get out of your comfort zone. This is different for everyone; it's a very individual thing. But one thing that I think is common for many people is that it's scary to give talks. I'm personally generally okay with that. This specific talk is actually a bit more stressful for me, because it's more of a personal talk than a research talk. But again, it's important to get out of your comfort zone. Talk to other people, whether it's at physical conferences, when these happen again, or in other venues. And invite yourself to places. I did that as an obscure first-year PhD student. I had one publication from my master's and one paper under review, and nobody knew me. I just happened to go somewhere on vacation, and through my advisor I emailed someone at a university there, a pretty good university, and asked whether I could go there and give a talk in the seminar. They said yes, and I made some connections that way, and that was really awesome. Of course, on the day itself I was like, "What am I doing? Nobody knows me. I'm talking about a paper that's not even published yet." But it was a good experience to have. And for me personally, even today, something that's hard to do is to send some important emails. I can spend hours trying to rephrase them and check myself and go over them again and again, but yeah,
it's part of the work so i have to do that turn your weaknesses into strengths so that's again something that's really personal but um i'm gonna give some examples of mine uh so one of them is that my memory is okay it's not really good and so i try to compensate for that with being very organized and documenting everything i'm also not perfect in that yet i'm working on improving um here's an example something that i um tweeted about i think two years ago that sometimes i google something uh that i don't remember the answer to and then i find my own answers in quora and i'm like i don't know whether i should be happy that i wrote that or sad that my memory is not better and that i don't remember the answer to that but it helps me to be organized and here's another example this is just a screenshot from my google drive i have this folder with a document of summaries of invited talks that i attended um a document with notes from conferences and a document with the names of people that i met during visits in universities uh that's an important one because i just realized at some point i'm meeting too many people and i kept running into people in conferences that i met in the past and i didn't remember so that's just to save myself from uh embarrassments and another one is that i sometimes read papers and i don't really understand them if they're not written simply and that's something that i think is important to communicate um science in a simple way uh you know and to reach a broader audience and for this reason i'm writing um a blog uh about nlp for non-experts this is uh not really a tip but just something to be aware of um life happens so um statistically in four or five or even more years of a phd program you will experience major life events outside work and so i personally had a health scare in my second year of phd i had to undergo a surgery i was okay after that i was 
there was not a lot of down time but still there were a few months before the surgery i wasn't very focused on work and i was really concerned and other people who studied with me had all kinds of different events like good and bad ones like getting married having babies losing family members also health issues and of course needless to say that those of you who are students now are experiencing a global pandemic which is uh definitely affecting your ability to do work but what i want to say is that's okay take a break if you need it don't beat yourself up for delays in your research um this is pretty much um you have enough time so uh it's pretty much given that there will be times where you find it harder to concentrate on work or you have to do other things and that's completely fine and um oh sorry and yeah i just want to say i hope that for you it's going to be mostly the good things uh but um you should take into account that these things happen uh and yeah i also want to this is again not a tip but i want to acknowledge that um there are some privileges that i have maybe that uh even if um you follow all my previous advice you might not get exactly where i got i mean people are different uh so for me i have to acknowledge that i have a very very supportive partner that uh i don't know cooks for me um feeds me when i'm working hard on a deadline and also was willing to move with me to the other side of the world for my work um also advisor and mentors along the way that care about me and that helped me advance in my career and talented collaborators that's really important my family is also supportive so my parents are not in academia but they always encouraged me to study scientific topics and i think that's important i know especially for women i know many that have been discouraged from studying um in such areas and for me it was i was always pushed to uh study that so i kind of take it for 
granted that yeah of course this is my place and i should be working on um on science um and also probably some of my personal character affects that so i am very persistent and i i have what i believe to be a healthy amount of israeli chutzpah so i get things if i want something i'm not afraid to ask for it and i usually manage to get what i want and luck of course luck always plays a role um so let me sum up with uh asking the question uh what makes a phd successful and so um the answer is not make a huge impact on the field the answer is graduate most phds don't make a huge impact on the field and even if you do um it's pretty temporary because the pace of research in nlp is just so fast that whatever you achieve now is probably going to be less relevant in a few years but what is important is for you to choose the work to do the work that you will be proud of in a few years and um you need to love what you do because if you don't then it's you're going to suffer for um it's a really long time so you have to somehow find find the things that you love to do or if you completely don't love it you should consider maybe not doing that it's it's also fine uh but um it doesn't matter how much you love it you should still have other things in your life uh other than research and um for those of you who are currently students i want to remind you that getting to where you are now already required a lot of talent and um and so the rest just depends on working well and by well i don't mean working very hard or working many hours i just mean find out what works for you like i found out what worked for me and it also requires a lot of luck so i wish you all good luck and um yeah so i'm happy to take questions i also want to mention that i will be looking for students um probably not for this year it's too late to apply for this year but if you want to work with me in the future next year if you're not students yet uh then or if you know someone that wants to do a phd then 
you should encourage them to apply to work with me and uh that's it i'm happy to take questions uh thanks so much vered for your talk uh and especially thank you so much for sharing some of the setbacks that you went through uh during your phd journey i think it's very important to tell phd students like us that it happens to even the most successful uh people in the field so it happens to everyone yeah yeah um we have fatma here who posted two questions in the chat um so the first is uh yeah now we have three questions the first question is do they give feedback when a paper is rejected how do you improve your work and submit it to another journal or to the same journal um yeah so i don't know if this uh if they refer to the advisor or the reviewers but i think um let's talk about the reviewers so when my paper gets rejected i tend to be and i think that's very natural when i first read the reviews i tend to be first of all very defensive and say oh no the reviewer is wrong there's nothing wrong with the paper they didn't understand it sometimes it is the case because you know reviewers are not always perfect but i try to not immediately you know respond to the authors if there's an author response period in the conference or if there's a like if it's a journal paper um i try to take some time to um to recover from the rejection and then get back to the um to the review and then go over the separate points that the reviewers made and see which ones of them are actually valid and which ones of them might not be completely true but maybe i can somehow change something in the paper that would make it better and um and which ones of them are just completely uh incorrect and just um happened because the reviewers didn't understand and then i can politely maybe reply to them uh i do think it is important to try to um to try to use these uh 
reviews to improve the paper um for the next submission uh but i also acknowledge that the reviewers uh in in nlp are sometimes not very good and there are going to be some cases i've had cases in the past where there was just literally nothing to uh no no action items nothing that i could do to improve the paper and i just submitted it as is and it was accepted for the next conference um so yeah it's different on a case-to-case basis um yeah i'm just going to go over the the questions one by one yeah so any tips on how to get published in top journals uh i wish i had um i guess um yeah i tried to choose important problems try to um um don't yeah i think in journals it's easier because sometimes you have like a at least i know in nlp you have like a rolling review so you can rolling um submission deadlines so you can you don't have you don't have to submit in a specific date so you can take time to uh make the work better before you submit it for the um for the next deadline um other than that i don't really know what to say it's uh there's uh yeah i don't know like do good work is not a really good tip i guess let's see what else how to create a good relationship with the supervisors are there patterns that we should be more careful uh that's a good question i think it's probably important before you start to maybe talk with the previous students or current students and figure out if there are any issues because sometimes you want sometimes people choose advisors just based on like who's successful in the field and that's i think much less important than um knowing that you can work with that person and that you can get along at a personal level because you will be working with that person for a few years so you need to figure that out before you start uh once you start um even the best advisors have their issues so you got to learn to to deal with them like i i i really have nothing bad to say about my phd advisor he's really great and he's been he keeps helping 
me um in my uh as my career advances but like one thing i learned about him during the phd is that he um he uh takes the time like if i if i want to discuss a topic with him then i should he he really works very differently than me like i could bring up some topic and then he dives deep into the smallest details and then there's no time to talk about anything else so i just learned to work around that so if i have a few questions then i would just start with the most important one and i would take into account that maybe we're not going to have time during this meeting to discuss all the other things you're going to have to learn to work around other people's work style and also again i think the the thing i said about getting feedback is i think that's in general probably true for all the advisors um that you need to um try not to be defensive just accept the the uh the feedback and say thank you and then take some time offline to go over it and decide which which of these uh comments you're going to implement and which aren't which you aren't going to um do you think it's worth to do a phd for those who don't want to stay in academia i think that's a very personal question i wanted i i knew i wanted to stay so for me that wasn't a question i think you should look into the jobs that you will be interested in doing and see whether they require a phd i know that some if you want to do research in industry in nlp or in other machine learning areas i think many companies do require i know my my brother has a masters and he worked on computer vision and he was looking into research positions and many of them required a phd so you might have to compromise on the type of job that you take but again it's depending completely depending on the type of job that you want to do uh because uh and also if if it is worth it if your second best job that you can get without a phd is good enough then maybe you should do that because just doing that for the sake of um you know a 
small difference between the jobs that you could be doing may not be worth it because it's a few years of your life and it is a compromise in many ways for example financially it is a compromise uh you're not gonna earn the same amount of money that you would in industry in that time so it should be a personal question there's no yes no answer for that um what are the skill sets you're looking for in an aspiring phd student i guess i should come up with an answer for that because people asked me that recently several times um i think it's really like i just want to evaluate everyone um separately i don't have like a fixed list um i do think it is important to uh be technically strong and also to um to want to work on the things that i'm interested in uh in general i have a pretty broad um research interest uh but um i want to work with people that care about the same uh topics i think that would be a good match um how to find the right mentors and advisors that's a hard question i'm not sure i know i feel like i was just lucky to find them i don't know if i have any so i didn't mention that but the way that i started uh was i did all my degrees in the same university at bar-ilan university and i um i just took an nlp course in the last year of undergrad and i decided to stay for masters and then i decided to stay for a phd and i just um so the professor that uh taught the nlp course uh ido dagan uh i just asked him whether i could be his master's student and then he took the chance on me and i just stayed with him for a phd and so for me i feel like it was mostly luck um but i think you should probably um if you are looking for a phd position now then you should probably just apply broadly and try to uh talk with a few like with one or more uh potential advisors everywhere and try to just get the gut feeling from your meeting with them whether it's going to be right for you to 
work with them and also ask previous students i guess that's helpful because sometimes there are some stories that you don't know about um okay um okay so someone asked i face this issue in every study i'm part of i limit the scope of the study at the time of framing the research objective but once i start working on the methodology each objective branches into many more sub-objectives or each method branches into multiple smaller methods this usually happens when i try to work on the problem considering multiple perspectives this leads to the addition of too many elements in the paper making the study less focused and also confuses the reader and i need to work around this okay that's a good question so i think you actually have a good problem to have because i think for most students and for me also when i was a student my problem was the opposite that i uh kept asking very focused questions and my advisor kept telling me if you want to stay in academia you have to start looking at the um like to have a longer term vision and to ask the more difficult questions um i think it's good that you have this vision that you know you're working on harder problems and i think that uh once you start you can just break it into multiple publications so try to just frame the smaller problems as a research question worth uh asking and publishing and um just um i mean you could still in the paper say i'm working on this broader goal but uh this paper specifically focuses on this uh sub element of that goal and uh and publish uh smaller goals i think it also makes for better papers usually when they're focused on a specific uh topic uh yeah and i think that was the last question here but if someone has another question they want to ask there's one more oh okay sorry where is that oh okay uh do you think for a phd student to be successful it matters the institutions top ones or more it matters the 
area and your supervisor um okay so it would be naive to say that it doesn't matter uh which institute um but i can say for myself that i um so i think usually in the field you would say well there are some top schools most of them are in the us and um you have to do your phd there to succeed so i want to say you don't have to because i did my phd in israel in a very good nlp group because i think that the group at bar-ilan is one of the uh best groups outside the us uh but again it's not a top school i don't think it's even the top school in israel it's like in computer science it's like not even the third i think uh but uh it's good to work with someone uh who is known in the field and um you could still succeed um if you want to stay in academia you probably will want to do a postdoc like i'm doing now um i think that really increased my chances to get a job and it is possible it's probably going to be harder but it is possible you don't have to go to a top school okay there are no more questions but i have one question related to one of the questions uh that were asked earlier so when you work on your phd you target a specific problem is it better to look at this problem from different perspectives and try to publish in these different perspectives or to look at your problem from one perspective and every time you publish people try to improve um the paper in that perspective what do you think is the best for your study i think i don't know if it matters really i think it just um whatever uh makes you publish um like for the second one it's more risky in terms of like uh you could have very incremental papers and i do think you need to avoid incremental papers um but i mean i think depending on your career goal if you wanna i think like for work in industry it might be useful to be able to stick to a specific uh method and then try to improve it 
because the goal in industry is for things to work and work well um for um an academic career uh maybe it's gonna be looked at less uh like i don't know i'm just trying to hypothesize i'm not sure it really matters but i would say maybe try to avoid doing incremental work and then um you could just work on like similar tasks in the same area or um different um aspects of the same problem or different approaches for the same problem uh or just try to mix both of them probably doesn't really matter okay um okay thanks so much for the talk it was very very useful and full of information thanks for sharing a lot of your stories and i think that was uh inspirational anyway so thank you so much and everyone is happy and they just say thank you on um on the chat so yeah thanks again um thank you very much congratulations congratulations and good luck thank you very much good luck to everyone thank you bye

Original Description

◾ (Hopefully-Reusable) Life Lessons for Ph.D. Students in NLP
◾ Speaker: Vered Shwartz
◾ Talk Description This talk is part of the #women_in_nlp talk series, which invites women who have successfully carved their career path in NLP to share their experiences and advice. Everyone is welcome to attend the talk, not only women.
◾ Abstract This talk will start with an overview of the problems I've been working on in semantics and commonsense reasoning. Natural language understanding models are trained on a sample of the situations they may encounter. To address unknown situations sensibly, they need commonsense and world knowledge and reasoning abilities. I will briefly introduce some research problems in these areas and the challenges in teaching machines commonsense. In the second and main part of the talk, I will discuss the lessons I learned during my Ph.D., which will hopefully be useful for junior and future Ph.D. students.
◾ Learn more about Vered: Vered Shwartz is a postdoctoral researcher at the Allen Institute for AI (AI2) and the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Vered's research interests are in NLP, AI, and machine learning, particularly focusing on commonsense knowledge and reasoning, computational semantics, discourse, and pragmatics. Previously, she completed her Ph.D. in Computer Science at Bar-Ilan University. https://vered1986.github.io/
◾ About #women_in_nlp Website: https://efatmae.github.io/women_in_nlp Twitter: https://twitter.com/fatmaElsafoury Slack channel on dair.ai: https://dairai.slack.com/archives/C01J0GXJMD1
◾ About dair.ai Website: https://dair.ai/ GitHub: https://github.com/dair-ai Twitter: https://twitter.com/dair_ai Newsletter: https://dair.ai/newsletter/ Slack: https://join.slack.com/t/dairai/shared_invite/zt-pcxkmoip-b4nJkci8L_dynpMwLvlCcQ


This video provides practical advice and insights for PhD students in NLP, covering topics such as common sense reasoning, language models, and research productivity. The speaker shares her experiences and insights on how to navigate a PhD program and build a successful career in NLP. By following the steps and tips outlined in this video, viewers can improve their research skills, manage their time effectively, and maintain a healthy work-life balance.

Key Takeaways
  1. Choose quality research problems
  2. Manage time effectively
  3. Maintain a healthy work-life balance
  4. Focus on a specific area of research
  5. Read literature to avoid repeating failed ideas
  6. Set work time boundaries
  7. Disable notifications on personal devices
  8. Exercise and prioritize self-care
  9. Learn to say no to non-essential tasks and collaborations
  10. Give back to the community by reviewing papers and organizing workshops or tutorials
💡 The key to success in NLP research is to focus on quality over quantity, choose research problems that interest you, and maintain a healthy work-life balance.
