Deep learning, science, engineering, research, and terminology

Gary Marcus
23 min read · Jan 1, 2020


A Dialogue between Yoshua Bengio and Gary Marcus

January 1, 2020

GM: Thanks for your last note, Yoshua, giving your definition of deep learning. I think you have your finger on something, and I definitely learned something from our conversation.

Whereas I am looking for a term that describes and analyzes the HOW of current research, you are really trying to characterize the GOAL of a research program. Those both seem incredibly worthwhile.

More broadly, I’m all for you defining your own terms. And you are right that much of the community is working on the research program you describe. You are also certainly right that the field has gathered many amazing priors already, and just as correct in observing that more needs to be done. There is a lot of truth in what you write above.

But I’m still uncomfortable with using a single term — deep learning — to describe both a research program going forward and a set of past models, particularly given how I understood people to be using the term historically.

YB: This research program has been going for decades and deep learning now designates a research area. Why does it make you uncomfortable for deep learning and not for machine learning, AI, reinforcement learning or kernel machines?

GM: Good point about generality. We do, for example, use machine learning both to describe a research enterprise and a set of existing models. I think the difference for me is that people don’t have a secondary meaning of machine learning as a hypothesis. People don’t write papers about how machine learning might be a good model of the brain, but they do write papers about how deep learning might be a good model of the brain.

YB: Let me try to clear up this confusion. There are two distinct enterprises; only one of them is called deep learning, and the other takes inspiration from deep learning. There is first an engineering effort and a research area, deep learning, that is essentially part of computer science and AI, and there we see inspiration from brain sciences fueling this research. However, let me remind you of my own thesis (shared with many): there are a few simple principles (call them priors if you want) which could explain animal and human intelligence, based on learning, and also allow us to build intelligent machines. So there is another research enterprise and it is carried out mostly by a different group of people, in neuroscience and cognitive science, which takes inspiration from deep learning in order to come up with theories about the brain and cognition. Those theories are articulated around some of the principles which deep learning has found to work well in AI. Of course, much of our strength in science comes from the synergies arising in multi-disciplinary interaction, and neural network research has a long history of such interactions. Yann LeCun and I have taken over a research program funded by CIFAR and created by Geoff Hinton called Learning in Machines and Brains, precisely to foster that synergistic research endeavor, and which funded early work in deep learning in this century.

GM: Let’s talk about two quotes from your 2016 text:

Deep learning is a particular kind of machine learning that achieves great power and flexibility by representing the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones.


The quintessential example of a deep learning model is the feedforward deep network, or multilayer perceptron (MLP). A multilayer perceptron is just a mathematical function mapping some set of input values to output values. The function is formed by composing many simpler functions. We can think of each application of a different mathematical function as providing a new representation of the input.
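To make the quoted description concrete, here is a minimal sketch of a multilayer perceptron written literally as a composition of simpler functions. The layer sizes and the hand-picked weights are assumptions chosen purely for illustration (nothing is trained here), and the helper names `affine`, `relu`, `compose`, and `representations` are hypothetical, not taken from the 2016 text.

```python
from functools import reduce


def affine(weights, biases):
    """Return a layer x -> W x + b for fixed, hand-picked parameters."""
    def layer(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, biases)]
    return layer


def relu(x):
    """Elementwise non-linearity: max(0, v) for every component."""
    return [max(0.0, v) for v in x]


def compose(*fns):
    """Compose left to right: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, fn: fn(acc), fns, x)


def representations(x, fns):
    """Apply each function in turn and keep every intermediate result."""
    reps = [x]
    for fn in fns:
        reps.append(fn(reps[-1]))
    return reps


# Illustrative (assumed) parameters: 2 inputs -> 2 hidden units -> 1 output.
hidden = affine([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0])
output = affine([[2.0, 1.0]], [0.5])

# The whole network is one function built from three simpler ones.
mlp = compose(hidden, relu, output)

print(mlp([3.0, 1.0]))
print(representations([3.0, 1.0], [hidden, relu, output]))
```

Each entry returned by `representations` is one of the successive "new representations of the input" that the passage describes; the final entry is the network's output.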

YB: MLPs are an example, but an example of a concept is not the concept itself.

GM: Good point! But we do often need superordinate terms, like vehicle to refer to cars, trucks, boats, etc. What I want for a superordinate term is something to describe that set of (say 2016-era) models, and as an outsider it seems to me that this is how many others have used and understood the term for years, even though I see now that you have always had something different in mind.

I also now see how the earlier instantiation flowed from your research goals, and see how a new instantiation, with plenty of focus on attention, gating, pointers, etc., is already starting to emerge. I also recognize now that many of the newer texts have some historical precedents.

YB: Thanks for acknowledging that.

GM: It’s become much clearer over our recent email exchanges; I am sorry I didn’t originally grasp what you are trying to do.

As I see it, we still need a way of talking about the sort of thing you presented in 2016 when you wrote about it as you did above. Whatever term you might use going forward to describe the research program you have in mind (compositional in your sense, neurally-inspired, learning-focused, etc.), we still need some way of picking out specific, existing models, such as the cluster of recent but perhaps no longer entirely current architectures that were the main focus in this 2016 text (CNNs, DNNs, RNNs, multilayer perceptrons).

YB: Sure. But let’s not call it ‘deep learning’. Let me explain why. There is obviously a strong reaction against deep learning from various camps who probably feel frustrated that this research program has attracted so much attention, funding and industrial success, in comparison to whatever their pet research theme might have been. You, yourself, are perceived by most of my colleagues as attempting to devalue ‘deep learning’. So your proposal can be interpreted as yet another attempt at DL-bashing, rather than participating in an honest scholarly discussion about scientific terms and research objectives. I don’t know who is right or not, but you should understand this is how many of your actions are perceived, as you consider your next move.

GM: Definitely not my motivation, but I can see how it might look that way. I am not a huge fan of the earlier classes of models, things like multilayer perceptrons, but mainly because I think they have been oversold, not because I don’t think they have some uses. I’m pushing on the terminology because I want the field to develop a broader toolkit, because there is a certain set of challenges that aren’t currently being met or even addressed, and because I have seen a lot of researchers — not you — react defensively even to carefully characterized limits.

So here is where we are, as I see it.

You’re not comfortable with my term core deep learning as a stand-in to describe extant techniques like multilayer perceptrons, CNNs, and RNNs that were the staple of deep learning for some time. I am ok with having some alternative, but would like some term.

I am not comfortable not having a way of describing those techniques, and not entirely comfortable with using deep learning to describe a research program that is open-ended, when so much prior literature seems instead to treat deep learning as a specific family of models.

My discomfort comes from a few places; let’s talk about those.

YB: Sure.

GM: One is a desire to make scientific statements about the capabilities and limits of various classes of models.

YB: I think part of your problem, maybe because of your background in cognitive science, is that you are still putting in the same bucket deep learning as a research area in computer science and deep learning as an inspiration for theories about the brain and cognition. It is normal for a research area to evolve and grow its set of concepts and tools. Its main thrust remains, though, and I would like to remind you that deep learning is about learning powerful representations. Our toolbox is about all the ideas, methods and architectures we are exploring to achieve that goal. I can understand that you want a name for some of those neural net architectures which have been the main ones studied in the past. Why not call them by their names, e.g., MLPs and convnets, if that is what you want to study the limitations of?

GM: I am ok with that in a technical context, as long as there is some sort of superordinate term, and actually carefully pitched my conclusion slide in our debate in roughly that way (I said that my concern was re “homogenous, multilayer perceptrons”), but we can’t really use those terms in the public sphere, and deep learning is very much in the public sphere…

YB: I don’t see a problem with using the term MLP or convnet in the public sphere. In fact they are already fairly commonly used. I am not sure I would include the term RNN in there because there is a sense it is a much more open-ended set of models (e.g., it includes models with gating mechanisms which could implement the kind of reasoning mechanisms I have been talking about in my recent talks), but it also includes models like vanilla RNNs (this is actually a precise term) and LSTMs which are specific models you can pick on.

GM: Agreed, RNNs are trickier. They share some properties with multilayer perceptrons but are also, as you say, more open-ended.

Another aspect of discomfort comes with my engineering hat on: my company, Robust.AI, is engaged in building software for robots. We need to be able to understand the capabilities and limits of various classes of model, so as to talk about which tools to use for which aspects of the problem my team faces. It is critical that we be able to separate potential future research from the capabilities of the current tools we have available.

YB: With your engineers, I don’t see why you could not use terms like MLPs and convnets; I’m sure they can understand that.

GM: Sure, but sometimes we need a superordinate term. There’s also a desire for philosophical clarity. It is coherent to say “model X” or “class of models C” has some property (e.g., sets records or fails to extrapolate beyond a training distribution), but it is incoherent to say that a research program with no absolute commitments has any properties at all. Statements like “deep learning has solved problem Y” (e.g., speech recognition) become incoherent if the term deep learning going forward references only a research program rather than a set of models and a desire for progress.

YB: I disagree. Again, you are viewing deep learning as if it was a theory of the brain. It is a research area in AI. See my above distinction between neuroscience theories inspired by deep learning and deep learning itself, which is a research area in AI. Does the word “machine learning” make commitments about a particular algorithm? No. Same for “kernel machines” or “AI”. And yes, we can say things like “machine learning has helped to solve industrial problem X”.

GM: It is an important realization, for example, to recognize, as we both do, that multilayer perceptrons are not the right tools for extrapolating beyond a training distribution. Only once there is a clear notion of which problems need to be resolved can progress be made.

YB: I am not even sure that the problem with MLPs is the architecture. I hypothesize that we need dynamically reconfigured connections but it may also be that the key is going to be changing the training framework (e.g. a la meta-learning).

GM: Fair point; the issue I see is really with a conjunction of architecture and (standard) way of training that architecture; rethinking the training framework is definitely one way to go.

Another issue I have is a concern about conflict with the ways in which the term deep learning has been used in the past. I accept that you have been thinking about deep learning in the broader, research program sense all along (even if I didn’t catch that initially) but a lot of claims in the literature retrospectively still seem to me to be framed around the stuff I want to call core models. Sentences in the prior literature like “deep learning broke records” make sense when one reads them as saying that “specific models of a certain sort” set records.

YB: These sentences also make sense when interpreting DL as referring to the research area. That research has indeed yielded a number of outstanding record-breaking successes. EACH OF THEM USED A DIFFERENT MODEL. What they have in common are the underlying principles guiding the research.

GM: Fair point; I see what you are saying. But it makes less sense when one reads the sentence as “a research program set records”.

YB: I totally disagree. We also commonly talk about the advances achieved with machine learning or the advances achieved with reinforcement learning, which also are research areas. Please give me one reason why the same sentences can’t be interpreted as talking about the broader approach. In fact interpreting those sentences as referring to a single specific model makes LESS sense because the specific models keep changing, yet we continue to refer to ‘deep learning’ or ‘machine learning’.

GM: Well, the thing is that the records are set by specific models; those models derive from the research program. I think it’s important to keep that distinction clear.

Of course it’s fine (even salutary) for research programs to evolve, but drift in word meaning can cause problems.

YB: There is no drift at the level of goals. All of the deep learning models from the beginning satisfy my definition. Of course, the individual models are different in each paper.

The only real CHANGE is that we keep adding tools to the deep learning toolbox and that makes sense in view of my definition, which is about general principles and goals.

GM: I get that you’re adding tools per your general principles, and have no issue there; I do think many understood the term differently before, at least in the lay public.

YB: I don’t think so. I think that lay people understand deep learning as a research area, not as a particular piece of code. Again, why don’t you pick on ‘Machine Learning’ as overly broad and ‘drifting’?

GM: You raise a good point here; ML certainly drifts too. I think to me the issue is a bit like the contrast between physics, which is a field that we expect to grow over time, versus a specific theory of physics, like Newtonian physics. We want to be able to say things like Newtonian physics was an improvement over Aristotelian physics, but Newton still didn’t capture everything.

If you keep doing what you are doing, I am pretty sure that the stuff you are doing in 2025 will be a lot better at what you want to call System II than the stuff that made me cranky in 2018, precisely because it was all System I and wasn’t very good at the System II stuff. I want to see the new research and what you come up with, but I also want to be able to trace the intellectual history, and to be able to talk about which tools do and don’t work for particular problems.

YB: Again, you are trying to define deep learning into something it is not. It is not a theory about the brain (although it should inspire such theories). It is in the same meta-category as machine learning. It is a research area. And by the way, I am not the one defining deep learning. A term is defined by its usage in the community, in English. And this is how it is used.

GM: What matters to me is what kind of architecture is actually going to work, either for AI, or for understanding the human mind. Calling everything deep learning seems to obscure which aspects of architecture matter.

YB: Then talk about specific architectures, like MLPs and convnets. By the way, I don’t think that my deep learning colleagues, like Geoff Hinton or Yann LeCun, ever thought that the brain was an MLP or a convnet. These have been tools to study ideas and principles, some of which may explain part of what we see in the brain.

GM: My contention has always been that neural networks might work if and when they became more inclusive, and started allowing in much more complex structures, particularly machinery for operating over variables and complex, structured representations. What I want to know, as a human being who invested decades in that, is, of course: was I right? Do we need those things?

YB: Now we are getting closer to a more interesting question than whether the name of a research area is appropriate or not. You have a particular mental picture of rules and symbols which might be somehow implemented in the brain. I have a different mental picture, in which something that could maybe resemble logic rules is found in the brain but is actually quite different in its details from the GOFAI vision of this, relying instead on the distributed representations, gradient-based learning and other concepts which we are building in deep learning.

GM: I would love to discuss this more in our next conversation!

For now, here’s my immediate concern: collapsing everything under one umbrella could obscure answers to questions like this: which architectural commitments actually matter?

Deep learning, in its homogenous form (e.g., multilayer perceptrons), is basically an extension of the “eliminative connectionist” position that tried to say “no rules in the head”; a new form of deep learning like the one you are after, with a different, more inclusive attitude, could drop that commitment, and it might work well.

That’s essentially what I was calling for in The Algebraic Mind. I’d love to see that. I do want some clear language though, so that people realize that there is a lot under the umbrella, that it’s not one size fits all, and that some hypotheses about architecture work better than others, quite possibly including some that I defended against great opposition for a very long time.

YB: But we do have names for types of models and approaches within deep learning. We talk about MLPs, we talk about fully-connected layers, we talk about multiplicative connections, we talk about attention mechanisms of various kinds. In the deep learning scientific literature, we have lots of specific names for different models, different types of trainable modules, different types of non-linearities. This is where the action is.

GM: Back to the terms, I think I am far from alone in seeing your current definition as differing from how many people historically have used the term.

YB: I am far from alone in agreeing with my definition.

GM: [Laughs] I don’t dispute this either.

YB: Not just the pioneers, but I bet that if you took a poll among the deep learning RESEARCHERS (e.g. who publish at ICLR) the vast majority would agree with me.

GM: That could well be; I know Tom Dietterich, for example, sees things similarly, as a researcher sympathetic to deep learning, though not a deep learning researcher per se.

YB: Now outsiders who may have a biased view of what we do may disagree, and that is a different issue.

GM: I am not sure it’s bias per se…

YB: A lot of the problem comes from the fact that deep learning has become a term used by journalists and CEOs and even became confused with the term AI in many circles.

GM: I think this is close to the crux of it; you very reasonably don’t want outsiders to tell you what to call your field, and I don’t really want to be doing that. But the flip side of the coin is that a lot of people outside the field now understand the term in a different way than the one you intended: CEOs and journalists, and also, I think, a lot of folks in government, laypeople, etc.

Not entirely sure what’s the best way forward, but am hopeful that our having had this conversation will shed some light.

YB: Sure.

GM: Just a few other points, and thanks for being so patient. Here’s one: as a scientist, I have a distaste for claims that are too open-ended.

YB: It is not a claim as in cognitive science, it is a research area. Again, please consider your same words but replacing ‘deep learning’ by ‘machine learning’. Are ML researchers making a claim? Yes, but a very broad one, which is that systems which learn can be very useful. So, we don’t think of it as a claim but rather like a research area. DL researchers are ML researchers and they make more specific ‘claims’, firstly that learning representations can be very useful. In the service of that, they have found other broad open-ended concepts to be useful, like ‘gradient-based optimization’, which they find efficient to serve the higher goal of learning good representations.

GM: I see that, but I also frequently encounter claims like “deep learning is a good model of how vision works”, and I don’t really ever hear claims like “machine learning is a good model of how vision works.” I definitely agree, though, that some of the ways in which you and I have talked past one another come from your perspective of deep learning as a research area versus my perspective as a cognitive scientist/cognitive neuroscientist who is often asked whether deep learning is a good model of cognition (and never asked whether machine learning is).

YB: Right, so let us not call that neuroscience research ‘deep learning’, but instead deep learning inspired neuroscience. And presumably there would be specific models involved, like specific convnet architectures. Neuroscientists understand full well that the brain is not using exactly the same algorithms as convnets, but it is plausible that there are enough functional similarities (e.g., the brain might be estimating gradients in a different way, probably much more noisy) that the resulting representations can have a lot in common.

GM: Or (generalizing your point) convnet-inspired neuroscience!

To try to put our discussion in terms we might be comfortable with, deep learning is for you a research program, but some of the specific models that have come out of the program are known by lay people (and many professionals in neighboring disciplines like neuroscience and cognitive science) as deep learning systems.

YB: Yes, there are software systems based on deep learning.

GM: It’s a legitimate question to wonder whether those specific existing systems (whatever we call them) are a good model of mind or brain. I see you as being perfectly consistent in your own usage, but see a different set of people as using the term in a different way.

YB: I already answered that. I believe that concepts studied in deep learning research can inspire models of the brain. For example, I have worked for many years on credit-assignment mechanisms (aimed at adapting synapses) which are analogous to but different from backpropagation and attempt to be more biologically plausible. Such papers indeed propose specific forms of computation as theories of some aspects of the brain. It is those computations — as candidates for phenomena in the brain — which reviewers evaluate when they review such papers. It is not deep learning in general as a field, and certainly not as a theory of the brain.

GM: I am much more comfortable when these things are stated specifically.

Here’s another issue. In your most recent definitions I don’t see any real commitments as to what might or might not be in the future scope of the term (e.g., gradients might or might not be included, MLPs that were once quintessential are explicitly envisioned as potentially not being part of the obligatory scope; Jeff Hawkins’ HTMs are usually seen as an alternative path to deep learning, but under your definition might count as deep learning, since they are heavily brain-inspired). Falsifiability seems fundamental for any scientific claim, and I don’t see how to apply it here, because the ultimate scope is very much still being worked out.

YB: Again, you’re trying to interpret a CS research area or a research program as if it were a scientific theory. First of all, we are more in the realm of engineering here (although I think that our findings can seed corresponding theories of the brain). Engineers build tools. The only ‘claim’ of a tool is its usefulness in some contexts. It is not ‘true’ or ‘false’. It is more or less useful as a conceptual device to build machines. If you wanted to cast deep learning as a scientific theory, it would have to be a theory of intelligence in brains, and you might have something like: the brain learns multiple levels of internal representations in order to make sense of the world and act in the world.

Since the details are not specified, there are many ways this theory could be specialized. For example, there are many people in neuroscience who explore the theory that in order to learn such representations, the brain approximates gradients of a loss function to modify synaptic strengths. That is a theory and people are doing actual experiments to try to falsify it.

GM: Here I think we agree, but stand in different places. A lot of our disagreement has indeed come from me wanting to look at deep learning through a lens of scientific theory, whether that theory be one of how the brain works or one of how intelligent systems might work. I don’t have any problem with taking a broad approach towards trying to engineer solutions. I just want to have a way of talking about the scientific questions.

YB: For that you will need to look at the claims made by specific papers, published in a brain sciences venue.

GM: Another concern I have is that some worthy ideas won’t get a seat at the table, because they don’t fit squarely under an umbrella that is broad yet in some way defined sociologically rather than mathematically or conceptually. A definition that is mathematical or conceptual means anyone can play; a definition that restricts things to what one community develops seems to me to be an invitation to exclude others, based not on content but on community membership. It feels more who than how.

YB: I don’t understand your last point, sorry. First you say that Deep Learning is too broad and now you say it is too restrictive?

GM: What I mean is that the term is too general from a scientific perspective, but still defined somewhat personally, and in that way could be restrictive to others. This last could have huge economic consequences in terms of how funding is distributed, cementing the position of an already extremely well-funded in-crowd, at the expense of outsiders that might have good ideas that are slightly off the mainstream axis.

YB: Please clarify because I don’t understand what you’re trying to say. Our market-based economy rewards tools and approaches which deliver value (with a lot of noise in the process, admittedly, and also sometimes at the expense of justice and equity). Papers which are submitted in the ‘deep learning’ category in a conference tend to be deep learning papers. At the same time, there is a reason why they fall under that umbrella. I have clarified my own view in a definition centered on learning complex and useful representations (which means obtained through the composition of learned non-linear operations).

GM: Overall I agree on the field being a market-based economy rewarding value, but I worry that if a research program is defined in terms of what a set of people are doing, rather than absolute terms (e.g., this is or isn’t a multilayer perceptron), there is a risk of adding a kind of political noise to the evaluation metric. (Probably this already happens to some degree with terms like AI and machine learning, to be sure; not saying that the problem is unique). And that in turn could have serious, negative scientific consequences, if good ideas from outside the community wind up starved.

YB: But you have both a meaning out of usage (by the deep learning community) and an emphasis on representation learning (hence the ICLR conference). Regarding your concern about the term being too restrictive, please note that the neural net crowd has always been extremely multi-disciplinary, in open discussions with many disciplines, and in fact importing ideas from many camps. I don’t see why it would stop. I’m interested in importing ideas from GOFAI to build deep learning system 2 capabilities, for example. But also from those who study causality. From neuroscience, from cognitive science, from game theory, from information theory, from philosophy of mind, etc. I don’t understand why you say these things. Geoff Hinton, Yann LeCun and I have promoted a multidisciplinary spirit, as witnessed by the Learning in Machines and Brains program of CIFAR.

GM: My experience is rather different. Every time I have tried to suggest that there are some limits, I have simultaneously urged for partnership and hybrid models. There is often pushback. On the day when the 2019 Turing award was announced, a reporter for Bloomberg asked Geoff Hinton what he thought of deep learning-symbolic hybrids; Hinton’s response was disparaging, not welcoming. In the words of the reporter, “[Hinton] compared this to using electric motors only to run the fuel-injectors of gasoline engines, even though electricity is far more energy efficient.” I get tweets all the time saying “the war between connectionism and symbols is over. symbols lost. get over it.”

I love that you are open-minded; that’s not always the case.

YB: I’m sorry that you are going through these feelings.

GM: I really appreciate that.

YB: In my mind there is a big difference between the GOFAI symbolic rule-based computation of the kind that Geoff Hinton talks about in this quote and the way I am thinking about deep learning implementing things like indirection, modularity, categories, etc. My bet is that deep learning variants can achieve the form of symbolic-like computation which humans may actually perform but using a substrate very different from GOFAI, with limitations similar to what humans experience (e.g. only a few levels of recursion), and circumventing a major efficiency issue associated with the search problem in GOFAI reasoning in addition to enabling learning and handling of uncertainty.

GM: I am looking forward to discussing that soon!

YB: You have to understand Geoff’s point of view too. When he was a young researcher, connectionism and neural nets were severely rejected by the mainstream AI and cognitive science research communities. It is not surprising that he now feels that history has demonstrated that this rejection was a big scientific mistake.

GM: I get that, and even wrote sympathetically about that in 2012, crediting him for persevering.

Thinking about a related (though not identical) set of issues to the discussion we are having about how to think about deep learning and what to call it, Yann LeCun made a suggestion in 2018 that I quite like: he suggested introducing a new term called differentiable programming. As he put it, this would refer to a “new kind of software [that works] by assembling networks of parameterized functional blocks and by training them from examples using some form of gradient-based optimization.”
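As a rough illustration of what “assembling networks of parameterized functional blocks” and “training them from examples using some form of gradient-based optimization” can look like, here is a minimal sketch in plain Python. The `Scale` and `Shift` blocks, the toy data (points on y = 2x + 1), the squared-error loss, and the learning rate are all assumptions made for this example; real systems rely on automatic differentiation rather than hand-written gradients.

```python
class Scale:
    """Parameterized block computing w * x."""

    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x  # remember the input for the backward pass
        return self.w * x

    def backward(self, grad_out, lr):
        grad_w = grad_out * self.x   # d(loss)/d(w)
        grad_in = grad_out * self.w  # d(loss)/d(x), passed to earlier blocks
        self.w -= lr * grad_w        # gradient-descent update
        return grad_in


class Shift:
    """Parameterized block computing x + b."""

    def __init__(self, b):
        self.b = b

    def forward(self, x):
        return x + self.b

    def backward(self, grad_out, lr):
        self.b -= lr * grad_out      # d(loss)/d(b) equals grad_out
        return grad_out


def train(blocks, data, lr=0.05, epochs=500):
    """Fit the chained blocks to (x, y) examples with per-example updates."""
    for _ in range(epochs):
        for x, y in data:
            out = x
            for block in blocks:            # forward pass through the chain
                out = block.forward(out)
            grad = 2.0 * (out - y)          # derivative of (out - y) ** 2
            for block in reversed(blocks):  # backward pass, last block first
                grad = block.backward(grad, lr)
    return blocks


# Toy examples drawn from y = 2x + 1 (assumed data, for illustration only).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
scale, shift = Scale(0.0), Shift(0.0)
train([scale, shift], data)
print(scale.w, shift.b)
```

The blocks are deliberately interchangeable: anything exposing `forward` and `backward` can be added to the chain, which is the sense in which the program is assembled from reusable parameterized pieces.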

YB: Yes. It is another broad category, which intersects a lot of deep learning research.

GM: This seems to capture well what a lot of people are working on now, and it doesn’t overhype things: it conveys that there is a particular approach that is essentially a set of tools, and it conveys something about how those tools work. It does not imply a level of conceptual sophistication (“depth”) that has not yet been achieved. It doesn’t invite the notion that it is a solution to artificial general intelligence; it does imply that it’s a really interesting new way to solve problems, which it is. Tom Dietterich made a similar suggestion recently, writing that “DL is essentially a new style of programming, ‘differentiable programming’, and the field is trying to work out the reusable constructs in this style. We have some: convolution, pooling, LSTM, GAN, VAE, memory units, routing units, etc.”

What you seem to be after is not quite the same: Yann and Tom are after a way of talking about a set of programming techniques, and more broadly an approach to programming.

What you are talking about, if I now understand correctly, is instead (though relatedly) an approach to research, and it’s a bold one.

YB: I am talking about deep learning as a research area. I am also talking about building on the deep learning tools we already have to expand its reach towards system 2 capabilities.

GM: Yes, I see the analogy you are making. More broadly, I think it’s thrilling that you are willing to consider abandoning some of deep learning’s historical roots (like gradients and MLPs) where necessary in order to face hard problems, following things where they take you without being dogmatic.

I love that!

Any research program that brave and open-minded certainly deserves a name, not to mention plenty of funding and smart students of the sort you’ve got at MILA.

But is deep learning really the right name for that research program, given the history of the term’s usage, and given your own willingness to depart from the past and to adventure full-steam ahead into the unknown?

YB: Yes, this is still part of deep learning research, and these attempts are not really new and started in the 90s. That being said, I think it is also fine to invent new terms to talk about the specific forms of deep learning computational capabilities which psychologists have labeled ‘system 2’.

GM: Ok, too bad I didn’t convince you (can’t blame a guy for trying!), but I think at least we are getting to understand each other better. One last question: It seems that you are willing to journey into parts unknown; why name the destination before you’ve arrived?

YB: Because it is a goal, an approach to AI. Just like machine learning wants to build machines which learn. We have made progress on that but everyone agrees that more needs to be done. That statement applies both to deep learning and machine learning, albeit deep learning is a subset of the approaches which machine learning explores.

GM: I get that. For me, though, I am still a bit stuck on the clash between what maybe we could call the public use and the private use, and for you what matters is what you want to work on.

But I think it’s ok if we disagree, and I wish you the best of luck on your research going forward!

YB: Thanks, and Happy New Year!

GM: Happy New Year, to you, too. I learned a lot.



Gary Marcus

CEO & Founder Robust.AI; co-author (with Ernest Davis) Rebooting.AI. Also proud dad, Founder of Geometric Intelligence, acquired by Uber, & Emeritus Prof., NYU.