Introducing Psycholinguistics into AI with Dominique Simmons- #23
Key Takeaways
The video discusses the introduction of psycholinguistics into AI with Dominique Simmons, covering topics such as cognitive psychology, neural networks, and multimodal training, with a focus on the intersection of language and psychology in AI systems. The conversation highlights the potential benefits of incorporating psycholinguistics into AI research, including improved understanding of human perception and attention, and the development of more effective AI models. Specific tools and techniq
Full Transcript
[Music] Hello and welcome to another episode of TWIML Talk, the podcast where I interview interesting people doing interesting things in machine learning and artificial intelligence. I'm your host, Sam Charrington. I think you're really going to enjoy today's show. Our guest this week is Dominique Simmons, applied research scientist at AI tools vendor Dimensional Mechanics. Dominique brings a really interesting background in cognitive psychology and psycholinguistics to her work in research and AI, and, well, to this podcast. In our conversation we cover the implications of cognitive psychology for neural networks and AI systems, and in particular how an understanding of human cognition impacts the development of AI models for applications such as media processing. We also discuss her research into multimodal training of AI models and how our understanding of the human brain has influenced this work. In addition, we explore the debate around the biological plausibility of machine learning and AI models. Really, this was a great discussion.

Before we jump in, though, I've got a question for you: how would you like one of our beautiful This Week in Machine Learning and AI laptop stickers? I know you want one. We've already sent stickers all around the world, and we'd love to send you one as well. All you need to do is pull up the show notes page, which in this case will be at twimlai.com/talk/23, and drop us a note with your favorite quote from the show. You can also post the quote via Twitter, just mention @twimlai, or on our Facebook page. Links to all of these will be available in the show notes.

I can't believe it's mid-May already. Next week I'll be hosting my Future of Data Summit in Las Vegas as part of the Interop ITX conference. I've got a ton of great speakers lined up for the event, including folks from Intel, Microsoft, GE, Capital One, Level 3 Communications, and Walmart, as well as leading industry analysts and startup executives. Topics will span IoT and edge computing, data management, and
of course machine learning and AI, blockchain, and much more. If you're planning to attend Interop, I hope you join us at the summit, and if you've been meaning to attend the summit but held out until the very last minute, it is not too late to register. In fact, you can do so using my 20% off discount code by visiting twimlai.com/interop. Of course, if you have any questions about the summit, please feel free to reach out to me via the contact page. And now on to the show. [Music]

All right, hey everyone, I am excited to have Dominique Simmons on the line. Dominique is an applied research scientist with Dimensional Mechanics. How are you, Dominique?

I'm doing great, thank you. How about yourself?

Doing very well, very well. I wanted to start this conversation by talking a little bit about your background. You have a master's degree in cognitive psychology and psycholinguistics, and you have ended up doing work in artificial intelligence. Tell us a little bit about your background and your path to working in AI.

So I'll start from the very beginning. As a child I was always fascinated by the brain. I grew up as an only child, and when I wasn't with my friends I'd journal and make observations about what people were doing and what was going on in my environment. What fascinated me was what made them do the things that they did, and that carried on throughout school. Eventually, in college, I had a great mentor who brought neuroscience into our curriculum. I started tabling for Brain Awareness Week, and I started learning about the brain and the circuits and how these processes come about. Again, that carried me through: I knew that I wanted to do that in graduate school. After becoming a lab manager at the University of Illinois Urbana-Champaign and being a postbac intern at UMass Amherst, I got a lot of exposure to different types of
psychology: anything from infant psychology, cognitive psychology, and psycholinguistics to music cognition, you name it. I was just very fascinated with all these aspects, and you could see these parts coming together as well. In graduate school I studied specifically multisensory perception, which is the influence of one sense over another and how senses interact. I come from a school of thought where the brain is agnostic, if you will, to input. Once the input is processed, then it becomes sound, or it becomes hearing, or any one of these senses, but the actual input is just information. The brain likes getting information, and learning and processing, and then at the later stages, that's when it becomes what we know as sensory input.

Okay.

So that gave me a unique background. I will say that in graduate school I got a little bit tired of the theory. Of course it's a critical foundation for a lot of work, but I started to itch for applications of these ideas. This idea of integrating senses: how can we make this into a device? How can it help non-hearing populations or populations that have sensory deficits? And so I worked on a brain training project for veterans with hearing loss. Essentially it was an auditory video game where they had to learn complex sounds in order to navigate through the game. When the sound goes up they have to jump; if the sound goes down they have to duck under something.

Oh, interesting.

Yeah, and so that was my first taste of the applied aspect of things, the applied world.

And the goal of that was assessment, or trying to rebuild new neural pathways to improve their hearing, or something
else?

It was a little bit of both, but really the latter: really trying to strengthen those connections, those auditory connections, through training them with complex sounds. We built the complex sounds in MATLAB and other tools. At that point we were doing some initial testing, but it's gotten pretty far. It's gotten to the point where it's in an app.

Oh, wow.

Yeah, so that was part of my graduate work. It's funny how I got into AI; it's a bit of a jump. Still on that applied path, I ended up here at Dimensional Mechanics as a user experience researcher in VR. That was the initial space that we were in, creating VR content and optimizing it.

Oh, interesting.

Yeah, so I was trying to work on things like immersion, enriching the user experience of someone in a VR environment: what are the perceptual aspects that come into it, and how to build a better immersive environment for users, and also avoiding things like the uncanny valley and nausea and all the not-so-good stuff that goes with VR. But eventually...

And uncanny valley is what? Uncanny valley, I forget what the phenomenon is.

Yeah, so it's when you see a character that is humanlike but not quite there. Usually you can tell in the eyes; that's usually a giveaway. You start to get this uneasy feeling because you see the humanlike quality that could be there, but it's not quite human enough. That's a big issue with characters in VR and virtual environments. But eventually we pivoted into the AI space, and so I've been using my background to help build the cognitive components into our system: things like decision making and perception, and memory as well.

Just out of curiosity, what drove the pivot from VR to AI?

It was, you know,
a business decision, but I believe that we saw a bigger opportunity, if you will, in AI, especially in the general AI platform that we're working on. VR is vast as well, and there are a lot of areas that you can go into that people don't necessarily recognize; it's way more than the entertainment space. But AI affects almost every single field there is, almost every single area of business. I was on a panel a few months ago and someone asked, "Is there any area that you can think of that hasn't been touched by AI?" We tried to come up with one example, but if it's not already, it is going to be affected.

Huh. And so, just for context, and I really want to dig into how your background ties into your current work, but for context, you said that at Dimensional Mechanics you're working on a platform. Did you say a general platform for AI or a platform for general AI? I thought I saw the latter on the website or someplace.

Yeah, so it's a general AI platform: essentially a set of development tools for companies so that they can reduce their engineering costs in AI and machine learning. AI is very expensive to produce, and you need quite a bit of manpower. Currently you essentially have to have experts, but we are trying to do away with that, so that someone who is interested in AI, or a company that doesn't necessarily have AI and ML experts, can still build models to solve their problems.

Okay, got it. I thought I read that you guys were going after the AGI problem, artificial general intelligence, but it sounds like it's more narrowly constrained than that.

A little bit, yeah. I mean, we're definitely adding to that conversation and adding to those efforts. We want to build general AI, but I should mention that we are currently in the
media space, so you have to narrow it down somewhat. We're currently working with companies to build AI models for images, video, and text.

Okay, and I imagine that the media applications lend themselves particularly well to your background and a cognitive approach. There are clear cognitive elements to AI with regard to media. Can you talk a little bit about how those intersect?

Right, so with, let's say, images and video, for example. As a matter of fact, I'm working on a computer vision project with a local university, and we've tried to find parallels between how humans perceive media, that's the best way to put it, and how a system would do the same thing. Obviously there are going to be vast differences between how a machine does it and how a human does, and we're not trying to make a one-to-one mapping, but there's a lot of information and a lot of inspiration in how a human perceives media that we can apply to how a machine perceives media. For example, in computer vision, you can think of it as a multidisciplinary field, because you're drawing from vision science, you're drawing from computer science, psychology, cognitive science. Part of the way that researchers create ground truth for computer vision experiments is they look at human ratings. They'll have humans view a set of videos, do eye tracking and whatnot, and then try to apply that eye gaze data to the system at hand so that it can be built in. So by using those human ratings you can build smarter systems.

Just to make sure I'm understanding all of what you're saying: we can envision a system, a labeling system
where you've got some humans that are trying to label a set of images, and you're saying that in addition to just the label, or set of labels, that they may type in, you're also doing something like eye tracking with a camera to see where they're looking on the images, and that helps you to refine the training process.

Right. There's a lot of rich information that can come from eye-tracking studies: what does the eye gaze data look like? One of my focuses is, no pun intended, attention, user attention. When we take in an environment, when we're looking around, there are tons and tons of stimuli. There's no way that we can process it all at once. So with humans, the whole scene is not relevant at the same time; it's going to be relevant in, basically, spotlights. You can apply that with a system as well. You may want to enhance certain parts of images or video, but you may want to do it in a fashion where it's focusing on the most relevant information in that scene.

And so, back to your grad school multimodal research and this model of the brain where you have just a bunch of inputs and you're processing them, kind of, as equals: are there other examples of inputs, of the multiple modes, that you're incorporating into your work today?

Not particularly currently, but something that I'd like to implement later is the intersection, or combination, of audio and visual. I have seen some early work in this. Let's say you're watching a video, and there's likely going to be audio as well. Where you're focusing your attention is going to be influenced not only by the visuals but also by the audio. But
what's critical is the combination of the spatial and temporal information. That's something that's going to heavily influence where your attention is guided in a scene, and so I think there needs to be a lot more research in that space. Previously there have been a lot of studies on this, but there's been a lot of focus on the visual, obviously, and there's been some focus on the audio. Really, where you're going to find fascinating insights is the combination: what's going on spatially, what's going on temporally, and that intersection.

And practically speaking, how might you expect that to impact an AI project?

So let's say you build a model to detect the tone of a commercial. Well, that's going to use the visuals throughout the commercial, and advertisers also put a lot of focus on the auditory aspect as well: the mood of the music going with what's going on in the scene. So with an AI system, if you're able to train it on the video and then train it on the audio... I'm thinking of kind of a thought experiment right now. You can see if it can recognize the tone of the video solely through the visual, and then see how well it does with just the audio. But I bet you that when you combine both of them, having both types of information will allow it to best categorize the tone of the video.

Got it. I mean, it sounds like the basic idea is just to try to use all the information that you have to make more accurate models, and often it turns out that the information that you have comes in multiple modes. Some of it is audio, some of it is video, and there may be others as well.

Right, yeah, exactly. The more information that we have
available, the more information we can train our systems with, and that's how we as humans learn as well. For example, in the education realm, you have a video of a lecture, you have an audio of a lecture, but the best is if you're actually in the classroom, immersed, and you have both sensory streams coming in, the visuals and the auditory. You're going to have a better chance of remembering the content and being able to build off of that as well.

Right, right. Now, a lot of your work calls into question the whole notion of biological plausibility for neural nets, and the extent to which we should be trying to model neural nets and AI systems after human systems. In fact, you wrote about this in a blog post last year. What are your thoughts on this whole question?

It's debatable, definitely, and it depends on what your background is. I've seen some hardcore computer scientists where they're like, "Ah, you don't need this plausibility." But since my background is very brain-heavy, I definitely want to include that in the conversation and say, okay, well, to what extent do we need to model these systems after the human brain? It does not need to be a one-to-one mapping, I do believe that, because with the human brain and a computer, the lower-level bits are very different. There's some similarity there, with neurons firing and the binary bits for computers, but there are so many other things that we need to tackle right now that it wouldn't be in our best interest to try to make a one-to-one mapping. But these systems, in my opinion, should be biologically inspired. So we can take concepts like modularity, or integrated systems, or localization; we can take all of these aspects from neuroscience and cognitive science and apply them to building these models, and
I personally try to keep that in mind when I'm building various AI models.

Now, I think of modularity and locality as computer-sciencey things, as technical things. Can you talk about how those concepts express themselves biologically and how they've influenced the types of models you build?

Sure. So with modularity, it's the idea that there are different areas of the brain for different processes: there's an area for language, there's an area for motor, there's an area for other types of processes and systems. And they're believed to be all-encompassing, if you will, or self-contained. I'm not particularly from that school of thought. I think there's relevance in that school of thought, but there's also a ton of interconnectivity. It's literally all connected, and even if there is some localization, a particular area for language, that's going to influence the visual processes, that's going to influence the auditory processing. So in my opinion it's really all about interconnectivity. But as far as building AI models, this is kind of going back to the thought experiment that I brought up. When you're training these models, you can think of it in terms of sensory input, and separate streams or combined streams. You can feed in audio, you can feed in video, and then you can feed in the combination of that, and you can think of that as the interconnectivity equivalent, if you will.

So this reminds me of some conversations I've had recently with folks working on deep neural nets. This question comes up when they're trying to develop these complex deep neural networks: whether they're ultimately developing this single model
that takes in all the input and produces all of the outputs, and it's kind of monolithic, or whether they develop what looks more like an ensemble, in a sense, or a hierarchical neural network model, where they're taking in inputs and training the model to be able to determine some kind of higher-level feature, and then they have many of these in parallel that they feed into a higher-level neural network to make the ultimate decision. And this may happen in several layers. It sounds like what you're saying is that that is ultimately closer to what the brain is doing. We think of the brain, I guess, as lay people, as this monolithic thing, but you're saying that in many ways it's kind of hierarchical. Is that accurate?

Yeah, I mean, you can take a particular system, the visual system for example. You get the early visual sensory input, and it's really globs and shapes and shadows, and then as the input gets further up the stream, to V3 and V4, you start to actually make out a particular image, a scene of a park or a beach or whatever it is. So yeah, it goes from sort of a lower-level abstract to the more... And I agree with the idea that we have these lower-level processes that are done first, and then as they are being carried out and further processed, you get things that are more fine-tuned, fine-grained, if you will. You can also apply that to the motor system as well. At first, when your motor system is still developing, you're going to have clunky, uncoordinated movements, but then as you go further along and you can fine-tune them, you're able to write, you're able to type, you're able to do more fine-tuned movements.

Are there a set of principles, do you think, that anyone
working in this field should be thinking about as they're approaching developing AI systems, with regard to this whole issue of biological reference or plausibility?

I would say think in terms of being interdisciplinary. There are a lot of areas that come together in AI, and I think in order to develop successfully you need to borrow from each of those. Like I said, computer science, cognitive science, psychology: all of these different areas are going to help you build a better foundation, as opposed to just focusing on the computer science theories or the cognitive science theories. You really need a combination of all of them.

Are there a set of canonical references that folks should take a look at to dig into this area more?

It wouldn't hurt to read up on the history of AI. I think it's a little-known fact for some that it's actually been around since the '50s. Marvin Minsky is considered to be the father of AI; I would read up on him and his work. I'd read up on Alan Turing; he laid some of the foundational thinking in AI. Yeah, I'd start there.

Awesome, awesome. So maybe let's try to dig into some more concrete things. Are there specific applications of what you guys are doing at Dimensional Mechanics that we can talk about, or ways that you're helping customers develop AI applications with your platform?

Yeah, so we have a demo on our site. I wish I could go into a lot more detail, but for now I'll just mention the demo. It's an image ranking demo where you can upload a photo and it will give it a score relative to the existing photos on there. You can see them, you can scroll through them, and there's a top 10 list.

Okay, and it's scoring it from what perspective?

So it's
trying to give it the best ranking. It's using a lot of different metrics; I can't really go into the particulars, but...

But is it trying to label the image, or trying to rate its aesthetic quality, or...?

Yeah, it's trying to rate its aesthetic quality.

Okay, got it. I don't know if anybody remembers the Hot or Not app; it's kind of the Hot or Not app for images.

Right, right, that's been brought up in our discussions. But yeah, like I said, it's rating the more aesthetic qualities of an image.

Okay, and in what ways has your work around cognitive psychology influenced how a system like that works?

Part of this is going to go back to how a person perceives an image, and I'll just give a very general example. When we see an image, we're taking in a lot of aspects: the shadows involved, the lighting, the angle of the object, the busyness, the contrast involved. We're assessing all of those things when we're viewing an image, and you could say that the system is doing something similar. This is kind of getting me into another space. Have you heard of the black box problem?

Generally, yes. I've heard of a black box problem; I don't know if it's the same one that you're thinking of. When I think of the black box problem, I think of it from an AI explainability perspective.

Right, right, and that's what I was going to get into. And there's a parallel with humans as well. We as researchers present a set of stimuli, and then a person responds to those given stimuli, but what exactly is going on in between is debatable, more times than not. And with AI, that's one of the things that we're trying to work on as well, not necessarily at Dimensional Mechanics but as a field: trying to
demystify what's going on in between. I think that taking a good hard look at the training sets that you use, and manipulating those in a way where you're feeding in sections at a time, and you know that this section has particular features and this one doesn't, can possibly get at that question. But it's going to take a lot more work, and I think once we do that, AI models will be that much more valuable, because we'll be able to tweak as necessary.

Right, right. So going back to this demo application: presumably you've developed some AI model to rank these images and you've trained it on lots of input data. Did the multimodal training come into play in this case?

Not in the current iteration, but that is something we would definitely want to explore to make it smarter, if you will.

Right, yeah. I can imagine, if someone is rating images on a numeric scale but you also had a camera looking at them, observing them, then there's a ton of additional information, like the creases in their eyes when they smile, the smile, that could perhaps lend some additional insight as to whether this is a visually appealing image.

Right, there are so many factors that come into play. Like I said, whether you have the right lighting, the expression on someone's face and what that's conveying, there are so many things. So it would be great to further explore it. We've built it on a particular set of features, but we would definitely like to expand that in the future.

Are there any other applications or use cases that might be interesting to explore?

Oh, there are tons. It's really about having time to explore them.

Yeah, yeah. It would be great to get a sense... I think you've given us a sense thus far that having a cognitive
psychology background can really lend insight into ways to think about your modeling and your training, and it would be great to then talk through some examples of how that's played out in a customer scenario, or just a practical scenario, so that folks can see the line go from beginning to end, if that makes sense.

Okay, and I've touched on this a little bit earlier, but we're looking at user interest and engagement. Essentially, when you see a scene, what's relevant to you? What catches your attention? And once various factors of that can be identified... By the way, we're looking at both lower-level features, things like line orientation and color, very low-level features that would grab your attention automatically, if you will, and also higher-level things like emotion. There's a term, top-down attention, which is intentional: you're looking at something intentionally, or it's goal-oriented. So, the possible applications of that research: I've talked about the advertising space. For example, if advertisers can take this research and realize, okay, at the 30-second mark, at this particular location in the scene, this is where people are going to be most engaged, where they feel like there's something most relevant, well, that could be a great placement for a product ad or a product logo. You can also get a sense of how long someone's attention is going to be sustained. In a world where we're on our phones all the time, that gap is shrinking, so with advertisers it's that much more critical to find what's relevant and get in there before
the user's attention is lost.

Okay, so through this research you're developing models that can look at... is it only static images, or is it video as well?

In this project it's video as well. Actually, this particular project is video.

Oh, okay. So you're basically training models to model human attention and interest on these videos, and then you can use that to help advertisers assess their work, for example. So as opposed to convening a panel or focus group to try to get a sense for whether an advertisement is effective, which is probably expensive and time consuming and maybe not even all that accurate, you can use these models to screen the advertisements for effectiveness. Is that the general idea?

Right, and going off the accuracy point: a lot of times in focus groups and whatnot there's a verbal or written response, but a lot of times what we say is not necessarily reflecting what's going on internally, and we try to get at that as well. We currently use tools like eye tracking and biometric sensors to get at the physiological responses to the video input.

Okay, and I forget if we covered this, but what do you call that phenomenon? I know a related idea is, I guess you would call it, attribution error, or issues around attribution, meaning if I look at an image I can tell you I like it, but I can't necessarily tell you why I like it. Is there specific terminology for the other side of this, which is that I might say I like it but I might not really like it, or vice versa?

Right, well, there are a lot of things that come into play, especially in an experimental setting. So there is experimenter bias: there might be some influence of the experimenter. You have to be careful in the way that it's asked, if you are going
to ask. Sometimes the question in itself can be loaded and bias the response. Uh-huh. This is getting into another area, but with eyewitness experiments, the way that you pose a question can definitely influence how the person will respond. Let's say there is an accident and you throw in an object in the question, like "at the stop sign, blah blah blah." Well, the fact that you mentioned the stop sign, even if there wasn't one there, the person may still agree and say, "Yeah, okay, I remember the stop sign." It's like, no, there wasn't even a stop sign at that particular scene. So there's experimenter bias. You mentioned attribution, but you also want to be seen in a particular way when you respond, so you have to keep that in mind, and we may not even be conscious of that. We just want to produce positive responses. So to get away from all that, I think a great measure is something physiological that doesn't get a chance to reach our thoughts. Interesting, interesting. So beyond just this attribution issue and my ability to articulate, there is a whole host of cognitive biases that can distance someone's real perception of an image or video from what they ultimately say in some kind of panel or focus group. And thus being able to develop machine models that can not just rate the image or model a human's reception of an image, but also maybe, as we get further along with the black box issue, tell us what the issues are: this one may be too dark here or too contrasty there, that kind of thing. Right, right, to give possibly a more objective measure, if you will. Mm-hmm. It sounded like you're wrapping up this research project. What's on the horizon for you? Are there any particular areas that you're looking
forward to exploring further? Right now I am working on learning more about the natural language processing space, which I find really fascinating, especially with my psycholinguistics background. So for those who don't know, natural language processing is the ability for systems to take natural text, anything as short as words and phrases up to full documents, novels even, and be able to get insights out of that input data. You can get more technical things like frequency counts and whatnot, but you can figure out the sentiment of a particular text; you can figure out all kinds of things that it would take a person hundreds and hundreds of hours to do. Right. You can just load in all of these documents and get a score for sentiment, is it particularly positive or negative, or all kinds of different insights. And what is piquing your curiosity around NLP and the application of psycholinguistics to that field? Maybe we can start with: what is psycholinguistics relative to traditional linguistics or other aspects of linguistics? Sure. So linguistics is the study of language, the parts of language, the structures and whatnot, and psycholinguistics is bringing psychology into that. You're looking at things like the way in which people say things. That term is prosody: the inflection in someone's voice to convey certain messages. Okay. It's fascinating. You take one sentence: "the man went to the park," "the man went to the park," "the man went to the park." You can say the exact same thing but put different inflection on it, and it has a completely different meaning. Okay. There's the way in which we produce sentences and the syntactic structure, and how that affects both the producer of the speech and also the listener. It also gets into coarticulation and these
other mechanics of speech. What's coarticulation? Coarticulation has to do with the flow of words. When we're speaking, all of the words seem discrete, but if you look at an audio waveform you'll notice that the sounds actually overlap, and so that's coarticulation. Then you also get into the social aspects of speech. So when two people are conversing, there's a phenomenon that occurs called common ground, and that's when you start out using different terms, but over the course of the conversation you start to use similar terms to each other, if not the same ones. Oh, yes. Because you build up a rapport. Gosh, there's so much. There's also this phenomenon where you're building up that common ground, but then you also take on a similar speaking style to that person, and that can also convey that you like that person, and that's called convergence. Okay. There's also divergence, where maybe you're not a big fan of that person and you start changing your speaking style, maybe unconsciously, but changing your speaking style nevertheless. Hmm, interesting. So when I think of natural language processing, I tend to think of applications that are either primarily textual or translation types of applications. But just your little explanation of psycholinguistics, and given some of the background we've talked about, it strikes me that there's a ton of interesting work and exploration to go into. So how do I articulate this? Well, maybe to articulate it by example: when you create a neural network to recognize images, we know a little bit about how those neural networks structure themselves, and you've got your edge detectors and your shading detectors and all those things that emerge. I wonder the extent to which concepts like prosody and other things, if
you're training a neural network on speech samples, if there are regions in the neural network that emerge that somehow reflect prosody, for example. Is that the current frontier of research? Yeah, that's definitely on the frontier of research in this particular space, especially audio. Visual inputs have been fairly well studied in the AI space, but audio much less. Part of the issue is the shortage of data in that space. There are quite a few open-source video and image data sets, but not so much on the audio side. But going back to your point, yeah, I believe that what you would do is set prosody up as a feature, you'd set coarticulation or syntax, you'd start to set these up as individual features to train the models. Meaning that you would have humans identify them somehow and label them? How would you set up prosody, for example, as a feature? It's a good question. So initially it would likely have to be at least partially supervised, and you'd want to use some sort of existing ratings, so probably human ratings. I can imagine you have a waveform, and they listen to the snippet and mark places where it goes high or goes low. This is also evident in the physical waveform, so I could see that even being done without human ratings: you feed the system various waveforms and the sounds that go with them, and it should be able to learn the parts where the audio goes up and the times where the inflection goes down. But the tricky part would be associating that with a particular meaning, and that's where you'd probably need humans to come in. So humans would tag the particular sentence as, oh, the emphasis was on this, or the emphasis was on that, and then once you have that list of different
emphases, you could train the model on that, and so ideally it would know that when the inflection goes up, that's where the emphasized meaning is. I would imagine that traditional linguistics has a lot to offer in terms of just how to represent all of this stuff. It strikes me that there's a representational challenge if someone were to try to take this approach to building and training models, but certainly linguists have been representing prosody in some kind of way and developing ways to map that to specific meanings. Is that correct? Yes, that is correct, and so it's definitely worth a look at that space if someone is trying to train audio models. That's why, going back to what I said about thinking multidisciplinary, it maybe wouldn't be obvious to go to the psycholinguistics area, but like we've been saying, it can give you some great insight into how to train an audio model, and it'll give you more than just the surface characteristics of a waveform or the audio. You'll be able to train it on meaningful insights as well. With prosody, you can see the inflections, but you can't necessarily see the meaning on the physical waveform. You need to add that insight in addition. Right. Oh, this is a really, really interesting space. Any other thoughts before we wrap up? No, I think this is a great space as well. What I particularly appreciate about it is, like I said, the multidisciplinary aspect of it. We're building these very complex systems, and I just mean that as a field, right, in addition to my company. We're building these complex models that in some respects reflect what's going on in a human brain, but we need to keep in mind that the more complex
they get, the more information we're going to need to seek out, and it's going to come from different places. Yeah, I can definitely see that. I think I've commented here on the podcast, or certainly on Twitter, that if I wasn't so busy trying to figure out this machine learning and AI thing, linguistics would be high on my list of things to figure out. It's a fascinating field, and it's great that you get to combine the two. Yeah, it is great. And one thing I did want to mention is that ideally the system would be language agnostic. So we're really teaching it about human language, whatever it could be, French, Spanish, English, whatever it is, which is very helpful, and it can be used in, like you alluded to, translation tools. Yeah, my favorite example of this, in fact, is from a conversation I did recently with Shubho Sengupta from Baidu labs, and he talked about how they were able to build an English-to-Mandarin translator, I believe it was, without having any Mandarin speakers on their staff, just based on this property that you're describing, the fact that a lot of the application of this is language agnostic. Exactly, because it's not working on those particular nuances, if you will; it's looking at it with a more agnostic view. Awesome. Well, before we go, if folks want to connect with you or get in touch, what's the best way to do that? You can connect with Dimensional Mechanics on Twitter at DM ncore Ai, and we're also on Facebook and LinkedIn. You can connect with me personally at artai, a t scii, with two zeros, on Twitter. Okay, so artai z00. Yeah. Awesome, awesome. Well, Dominique, thanks so much for being on the show. It was great chatting with you, and looking forward to reconnecting soon. Thank you, Sam, I really appreciate it. Thanks for having
me on. Absolutely, take care. Bye. [Music] Bye. All right everyone, that's our show for today. Once again, thanks so much for listening and for your continued support. Don't forget to share your favorite quote from this show via the show notes page, Twitter, or our Facebook page, and if you do, we'll be happy to send you one of our laptop stickers. If you're planning to attend the Future of Data Summit next week, please reach out and let me know to look out for you. The notes for this show will be up at twimlai.com/talk/23, where you'll find links to Dominique and the various resources we mentioned in the show. Once again, thanks so much for listening, and catch you next time.
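The prosody idea raised in the conversation (feed a system waveforms and have it find where the pitch goes up or down, before any human tags meaning onto it) can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not the guest's or Dimensional Mechanics' method: the function names `synth_glide`, `frame_pitch`, and `label_contour` are hypothetical, the "speech" is a synthetic sine tone, and pitch is estimated crudely from zero-crossing counts, which only works for clean, tone-like signals. Real speech would need a proper pitch tracker.

```python
import math


def synth_glide(f_start, f_end, duration=1.0, sample_rate=8000):
    """Synthesize a sine tone whose frequency moves linearly from f_start to f_end (Hz).

    Stands in for a recorded utterance; purely synthetic test data.
    """
    n = int(duration * sample_rate)
    samples, phase = [], 0.0
    for i in range(n):
        freq = f_start + (f_end - f_start) * i / n
        phase += 2 * math.pi * freq / sample_rate
        samples.append(math.sin(phase))
    return samples


def frame_pitch(samples, sample_rate=8000, frame_len=800):
    """Rough per-frame pitch estimate (Hz) from zero-crossing counts."""
    pitches = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        # A sine wave crosses zero twice per cycle.
        pitches.append(crossings * sample_rate / (2 * frame_len))
    return pitches


def label_contour(pitches, tolerance=10.0):
    """Label each frame-to-frame step as 'rise', 'fall', or 'flat'.

    The tolerance (Hz) is an arbitrary illustrative threshold.
    """
    labels = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur - prev > tolerance:
            labels.append("rise")
        elif prev - cur > tolerance:
            labels.append("fall")
        else:
            labels.append("flat")
    return labels


if __name__ == "__main__":
    rising = synth_glide(100, 400)   # pitch sweeps upward
    falling = synth_glide(400, 100)  # pitch sweeps downward
    print(label_contour(frame_pitch(rising)))
    print(label_contour(frame_pitch(falling)))
```

Labels like these capture only the surface contour. As discussed above, tying a rise or fall to a particular emphasis or meaning would still need human tags layered on top.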
Original Description
I think you’re really going to enjoy today’s show. Our guest this week is Dominique Simmons, Applied Research Scientist at AI tools vendor Dimensional Mechanics. Dominique brings an interesting background in cognitive psychology and psycholinguistics to her work and research in AI and, well, to this podcast. In our conversation, we cover the implications of cognitive psychology for neural networks and AI systems, and in particular how an understanding of human cognition impacts the development of AI models for media applications. We also discuss her research into multimodal training of AI models, and how our understanding of the human brain has influenced this work. We also explore the debate around the biological plausibility of machine learning and AI models. It was a great conversation.
The show notes can be found at twimlai.com/talk/23.
Subscribe!
iTunes ➙ https://itunes.apple.com/us/podcast/this-week-in-machine-learning/id1116303051?mt=2
Soundcloud ➙ https://soundcloud.com/twiml
Google Play ➙ http://bit.ly/2lrWlJZ
Stitcher ➙ http://www.stitcher.com/s?fid=92079&refid=stpr
RSS ➙ https://twimlai.com/feed
Let's Connect!
Twimlai.com ➙ https://twimlai.com/contact
Twitter ➙ https://twitter.com/twimlai
Facebook ➙ https://Facebook.com/Twimlai
Medium ➙ https://medium.com/this-week-in-machine-learning-ai