Transcript
Shane Hastie: Good day, folks. This is Shane Hastie for the InfoQ Engineering Culture podcast and we are having great fun getting started today, but I’m sitting down with Idan Gazit and Eddie Aftandilian. Did I get that close enough?
Idan Gazit: Success.
Eddie Aftandilian: That was good.
Shane Hastie: Gentlemen, welcome. Thank you for taking the time to talk to us today.
Introductions [00:58]
Idan Gazit: Thank you. Thank you so much for having us.
Shane Hastie: You’re both from GitHub Next. What’s GitHub Next?
Idan Gazit: That’s an excellent question. We’re what would normally be called a long bets team or an innovation team or a labs team. I like to describe us as the department of fool around and find out, because we’re there to try things. It says research on the door, but the reality is that we’re a prototyping team.
Eddie Aftandilian: I mean, when we first started we did a lot of things, but we pretty quickly focused down on AI. And as Idan said, our job is to prototype new ideas and test viability. And if we come across something that looks like it might be viable, help it graduate out and become a real GitHub product.
Shane Hastie: And before we go deeper into the team and the products, who’s Eddie? Who’s Idan?
Eddie Aftandilian: Okay, I’ll go first.
Idan Gazit: Go for it.
Eddie Aftandilian: I’m Eddie. I’m a principal researcher in Next. I lead and manage about half of Next, and Idan manages the other half. My half is the more research-y focused people in Next. We have people who have backgrounds in machine learning and programming languages. We bring, I guess, the rigor to Next projects. Idan’s half of the team, I’m going to step on your toes here, but Idan’s half of the team is more front end and developer experience focused. They build really cool prototypes of things, and we demo those things, and sometimes people get excited about the demos. But in this AI world, it can be very easy to build something that’s cool and actually really hard to make it work for real as a tool. And I see the goal of my wing of the team as taking these cool ideas and making them work reliably.
Sometimes that involves things like building evaluations that we can use to measure how reliably these things work and then we can drive them up. Sometimes it involves taking results from the research literature and then applying them to Next projects. So making this match between things that are already known in the research literature with actual problems that we have. And then sometimes it involves actually coming up with new techniques to solve some problem that we have in a prototype or a real GitHub product. I’ve been in GitHub Next for about four years. I joined at the very beginning of the Copilot project, so I joined to work on Copilot. I was one of the original Copilot team members.
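The evaluation work Eddie describes can be pictured with a minimal sketch. This is an illustration, not GitHub's actual harness: `generate` is a hypothetical stub standing in for a real model call, and the checkers are deliberately simple.

```python
# Minimal sketch of an evaluation harness of the kind described above.
# `generate` is a hypothetical stub standing in for a real model call.

def generate(prompt: str) -> str:
    # Stub model: returns canned completions; a real harness calls an LLM API.
    canned = {
        "add two numbers": "def add(a, b):\n    return a + b",
        "reverse a string": "def rev(s):\n    return s[::-1]",
    }
    return canned.get(prompt, "")

def run_eval(cases):
    """Each case is (prompt, checker); returns the fraction of cases that pass."""
    passed = 0
    for prompt, checker in cases:
        output = generate(prompt)
        try:
            if checker(output):
                passed += 1
        except Exception:
            pass  # a crashing checker counts as a failure
    return passed / len(cases)

cases = [
    ("add two numbers", lambda out: "return a + b" in out),
    ("reverse a string", lambda out: "s[::-1]" in out),
]

print(run_eval(cases))  # → 1.0 with this stub; real models score lower
```

Re-running a harness like this after each prompt or model change is the "drive them up" loop Eddie mentions.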
I helped take it from this original hacky prototype all the way to general availability. I worked on it for about a year and a half, and I ended up leading quality evaluation and improvement for Copilot for about a year. And then I came back to Next to work on new projects. Before that, I spent 10 years at Google working on internal developer tools. And before that, I did my PhD in programming languages at Tufts University in Boston. And so the through line of my career has always been developer tools, building tools to help developers be happier and more productive.
Idan Gazit: Hi, my name’s Idan. That’s a tough act to follow, Eddie. I currently lead GitHub Next. I joined about five years ago, so just a little bit before Eddie, and for most of the time that I’ve been at Next, he and I have been peer managers running it. Now I run the group, but we still have our domains of expertise. And like Eddie says, my entire career has also been about developer tools. Prior to GitHub, I was at Heroku, and I was a core contributor to the Django web framework, and I spent a lot of my life in and around open source. Generally speaking, web technologies are my home base, but interaction design, interfaces, user experiences, cognition, perception, these are the things that are generally my job.
And nowadays in the context of GitHub, it’s exactly what Eddie said. It’s bringing both halves of the house to bear on these problems. And I think something that’s interesting about Next is that we don’t execute along reporting lines. It’s not that Eddie’s folks are working on Eddie’s projects and my folks are working on mine and never the twain shall meet. Instead, most challenges require both halves to overcome them or to really advance the state of the art.
We have to bring in some folks that have the hard background in machine learning, and maybe we need to train a model or build custom techniques into the models, but then we also need to create the right interface to that, and use it to elicit information from the humans using it that is going to make the AI succeed at the task. So it’s this very interdisciplinary squad. It’s a real joy. It’s one of the most unique teams I’ve ever seen anywhere, and a privilege to work alongside everyone. I think that’s all the stuff that matters to me. You’ll hear me rant a lot about nouns and interfaces, and I’m really into typography and color and all of those things. But yes, I’m a straight-up nerd.
Shane Hastie: What does it take to hold a team like that together?
Leading an innovation team [06:18]
Idan Gazit: We have a cycle that we’ve now gone through a number of times, and not all parts of the cycle are fun. It sounds really great, right? It’s the team of permanent hack week. Nobody’s saying, “Listen, here’s the Gantt chart, go execute”. In fact, generally speaking, Next is an undirected exploration function of the business. If GitHub’s leadership wants something, they can direct parts of the business to do it. But in our case, what I think is expected of us is that we need to create options for them. We need to roll up to them with here’s a bet that the business could make, and here’s the evidence that supports this bet. We can’t roll up to them with a doc and a philosophy. We have to roll up with something qualified that’s going to persuade them to spend the resources, not the 15 people that are Next, but the much bigger, broader GitHub.
And so we have this cycle, and we’ve now been through it enough times that there’s some pattern to it. The first part of the cycle is this blank canvas dread phase of we don’t know what we’re making and we’re trying to find something: which spot do we dig in? Where do we see glints of gold? And so I’d say in that phase, the thing to inspire and rally the team is direction setting, maybe that’s the word that I’m looking for. It’s saying these are the strategic directions that are valuable for us to be digging in, and figuring out what those are is not easy. It’s quite a bit of prognostication about where the tech is going and where the market is likely to be in a year or two. Because at the end of the day, if what we want to do is make things that graduate and escape the lab, then they need to have a business future.
It can’t just be like, “Oh, this is a technically beautiful thing, but I don’t know what application it has that people are willing to trade money for”. So that’s the hard phase. And then the rest, once we’ve selected projects, that’s the happy phase. There, I think folks are internally driven because they’re excited for the making. And ironically, that doesn’t require much in the way of pushing at all, because we hire people that have this zero to one mindset. It makes hiring very difficult because it’s this very intangible quality. If I go up to my recruiter, they’ll be like, “What do you want? You want a front end engineer, you want a back end engineer?”
Hiring people with the right fit [08:52]
I want people who make, and they’re like, “I don’t know how to filter for that in a resume”. And I’m like, “I know, that’s really hard, but that’s what I mean”. And so by hiring correctly, that second phase becomes a lot easier. And then the third phase is the storytelling, the going to market, the evaluation and the research on the things. We field a prototype and then we’ve got to go and study how people use it, what value they’re getting out of it, because that’s the evidence that I need to produce in order to roll up to leadership and say, “Here’s what we’ve got”. So it’s different things that we need at different phases of this cycle.
Eddie Aftandilian: I mean, the thing about hiring people with this zero to one mindset is important. And I think it’s important, when we hire, to communicate what this is really like to people, because it can sound like it’s all sunshine and rainbows, that it’s always exploration. But actually the exploration, like you said, is often the hardest part, where you’re trying a bunch of stuff and most of it doesn’t work, because that’s the nature of what we do. We try to set ambitious goals and then, I don’t know, I think we say 80% of the stuff we try doesn’t actually work. And that part can be really demotivating. If you’re not willing to stick it out and have trust in the process, you can get pretty down, and it’s important to know what you’re actually getting yourself into when you join Next.
Idan Gazit: Yes, absolutely. I might add something onto that, which is that maybe the hardest kind of classification that we have to do in that early phase is not even when it’s clearly not working. That’s an easy classification to make. The hard part is, what if it’s cool and it might be good, but we don’t know if it’s really going to be good? How do you distinguish the potentials from the real deal? I wish I had a flowchart for that, but I don’t; every experiment has its own parameters.
Eddie Aftandilian: You can get distracted by the coolness of something, right? You said in the end, we need to produce something that people want to trade money for. And often the cool thing is not exactly that. So it’s a hard process to go through.
Idan Gazit: Also, you have to let go of everything that you make. Another thing that I’ve learned over the years of hiring folks into Next is that in the hiring process, I’m like, “Listen, every time you succeed, you’re going to give your baby up for adoption”. And that’s what success looks like. You hand it off and you hope that it has a good life, but you can’t guarantee it. And on occasion, we’ve had at least one or two folks that have gone with it. They’re like, “No, I want to keep working on this in perpetuity”. And this is maybe the happiest form of attrition we experience at a group like Next, when somebody is just like, “Nope, I want to stick around for the lifetime of this baby and see it grow up”. Then they depart with the thing into engineering, and that’s a win all around for the business. We’re sad for the loss to our team, but it is strictly in the good column for the overall business.
But that’s something that I have to be upfront about, because the things that we try that don’t work out, we hand off in the good case, or we shelve them. That’s the other core activity that we engage in as a team: not fooling ourselves when something isn’t working, because opportunity cost. If it’s not working, maybe it’ll work in a year or two when the models get better, when we have another brainstorm and come up with a better technique, I don’t know. But right now it’s not working, so instead of throwing “good money after bad”, let’s stick it on the shelf and turn our attention to something else. And that kind of honesty is really hard. Particularly when you’ve spent the last month scrabbling with your fingernails to find purchase on a topic, to then say, “You know what? I give up on this”, can feel like a punch to the gut. But that’s the job.
Shane Hastie: Can you tell us a story of something that looked really interesting but then just didn’t make it?
Examples of successes and failures [13:05]
Idan Gazit: I have one, which is Blocks. This one was very personal to me. I was the one who led this thing, and I was the one who had to kill it when it didn’t take off. Blocks was not even really an AI-specific thing. It was this notion of what if we could make GitHub itself extensible, beyond what everybody’s seen now. If you have a markdown file in GitHub, you don’t see the raw markdown, you see the rendered markdown, and if you have a mermaid diagram in GitHub, you see the mermaid chart, you don’t see the source of it. So we were like, okay, let’s take this concept way, way further and be able to have small applications that you could publish, so that you could view your repository another way instead of seeing a list of files, or have small applications in your repository. So it’s not a full-on platform for deploying applications like a Firebase or Vercel or a Heroku or whatever.
But miniature apps specifically tied to your code bases. And we had a lot of great early signal from the earliest of early adopters. We have a Discord, and the kinds of folks who self-select into it want to play with tomorrow’s tools today and are willing to endure a lot of stuff being broken for the privilege. And so they came along and they made a lot of really cool blocks, as we called these miniature applications. But in the end, it never crossed the chasm. It never achieved that sort of status, even though there were definitely customers that were asking us, “Hey, we’d really like to extend GitHub. Are there ways of doing that?”
And there are extension points for GitHub, but nothing that lets you really affect the user experience of GitHub itself. And so we shut it down with a heavy heart and everything. I sent out the shutdown email and that was awful and I’m glad that we did it, but I think that’s a good example of killing your darlings in exchange for being able to pursue new darlings. So that’s my example of something that didn’t pan out.
Shane Hastie: And what about something that did? What happened with it?
Eddie Aftandilian: I mean, Copilot is the big one. Copilot was created within our team. It was the first big success for Next. It turned out to be huge. We didn’t know that at the time, but it started within the team. It started out as a collaboration with OpenAI, where we got access to their code-generation models, which were distinct from their other models at the time. And we were tasked with figuring out, well, how do we make this thing actually useful to people? And so we built different interfaces.
The one that was obviously the right interface for it was code completion in your IDE, in VS Code. Once we landed on that, it turned into sort of a productionization march. For all Next projects that succeed, we take them to technical preview, which means it’s a closed preview, usually behind a waitlist; people have to apply and we select them as we have capacity. And we usually do that within the team.
And as Idan mentioned, we start at that point doing user studies and collecting data about what users do, collecting survey feedback, in order to make a case about whether to take this to full production or not. We did that for Copilot, and once we hit the technical preview phase, it became pretty clear to everyone that this was going to graduate and become a real product. At that point, the team started growing. About halfway through that cycle, those of us who started on Copilot in Next were moved into GitHub engineering; we went on loan. I was on loan for nine months to engineering. We helped take it all the way to general availability, and we also helped build the team there. We did the hiring and such that would make it a sustainable project within GitHub. And then we came back to Next for the next round of projects.
Shane Hastie: You made the point, there’s no crystal ball. How do you choose the bets?
Decision making and bet selection [17:29]
Idan Gazit: Well, first of all, I think when you ask how do you choose, it implies that there’s this one checkpoint. It’s like here’s the do or die line; beyond it, we’ve got to do it. And the reality is that it’s almost never that. It’s almost always a ramp. The moments where you look at something and you’re like, “Oh, snap, that is definitely, definitely going to be a thing”, those are few and far between. I think original Copilot was the strongest I’ve ever felt that about anything, where you just look at it and you’re like, “This is the future. I’m not going to work without this in the future”. The rest of the time you have mixed signals. And so the way that we structure our explorations is that effectively every exploration has a dead man’s switch on it, in the sense that it’s always about to run out of its lease on life.
And at any moment, we’re always asking ourselves, what is the strongest small piece of evidence that we can produce in this iteration that’s going to persuade us to extend this project’s lease on life? So right now, can we get to a chewing gum and duct tape prototype that we could kick the tires on ourselves? Then we’re using it heavily and asking, do we believe in it or not? And when we’re done with the kind of validation that we can do just ourselves, it’s time to expose it to more people. Well, there’s quite a large business of developers at GitHub, some of whom are like, “Yes, I want to play around with something that may or may not ever be good”. So we’ll go to them and we’ll be like, “Hey, who wants to kick the tires on this and give us feedback?”
We have external developers that are close to GitHub, like the GitHub Stars program. These tend to be prominent open source developers and personalities and folks like that. And so we can go to them and show off things that are not ready for the light of day; because they’re already under NDA with GitHub, we’re able to give them early access to things. And so we’ll try it out on them and get signal from them, and make sure that they find it valuable, that they find it does something that they want it to do. Then the apex of this curve is the public technical preview that all projects need to get to. We’ll open up a waitlist and enroll folks, anyone from the general public who wants to play around with this, and then we’ll conduct user research with them.
And so it’s not like how do we decide in the early phases; maybe that’s where the decision making around direction happens. In the later phases, it’s more evidence-based, around what we’re hearing back. But in those early phases, I’d say part of it is, again, hiring. It’s hiring people that have a lot of experience. Next is a very unusual team in the sense that everybody at Next is at the top end of the experience scale. Our most junior member is a senior, which is highly aberrant. There are no other teams at GitHub that I’ve ever seen that look this way, but I think that it’s fair and correct, because we’re riding on the backs of these individual contributors. We’re telling them, “Go advance the state of the art, don’t mess it up. Bye”. And we’re not giving them a whole lot of guidance. And so because they’re experienced, I trust them to execute.
But beyond that, I’m also asking them to bring their experience and their opinions with them to work. And then inside our house, in our private Slack channel, we’re arguing with one another. That’s a good idea, that’s a terrible idea. Or what if we tried it this way? And we have to have that trust and that candor internally in order to be able to really wrestle with these ideas. And so people are making these small spikes and these innovations can come from anyone on the team. It’s not like I put together some PowerPoint. I show it to the team and I say, “We’re going to do this”. It usually doesn’t look like that. It usually looks more like individuals on the team are putting forward miniature things that they’ve put together. Like yesterday, I played around with the models. I was trying to get them to do something and I did this, what do you think?
The power of demos and internal validation [21:59]
And then we’ll do demo day every Thursday where anybody from the team will do demos. And then I guess it feels a little bit like a jazz band. Everybody’s trying out things and suddenly somebody plays a really cool lick and then everybody’s just like, “Whoa, whoa, whoa. That was really good. Do that again”. And so the first signal that we’re looking for is our own excitement because sometimes that’s the only signal that we can possibly have when it’s novel things. There is no prior art to compare it to. So I don’t know, that’s messy. But I think that that’s the reality on the ground.
Eddie Aftandilian: There’s something very magical that happens when someone shows a demo and everyone else gets excited. And as Idan said, Next is pretty undirected. Especially in this exploration phase, people can wander around and work on what they want to work on. And when you start to see people sort of gravitating towards someone else’s demo or a project, I think we both take that as a super positive sign that there’s something there. These people sort of voting with their feet for what they want to work on and what they want to invest their time in.
Idan Gazit: Yes, the first customer we have to persuade is ourselves. If we’re excited about it, that doesn’t guarantee that it’s going to be a thing, but it’s a healthy early signal. And I can’t underscore enough the currency of demos at Next. Demos are everything. Nobody is going to sell anybody else on a notion just with a deck. You have to actually make something, show something, let other people use it, even if it’s messy and terrible. It doesn’t matter; we’re all comfortable with messy and terrible, because that’s the genesis of great ideas. It always starts out as something messy and terrible.
Shane Hastie: So in that creative space, how are you using generative AI tools?
The benefits of using AI tools [23:50]
Idan Gazit: I feel like I spend most of my time interacting with the APIs of the models, building things to try out whether the model can do something, or using our own tools. I can’t say that I’m a super intense user of every generative AI tool out there. Like everybody else, I use quite a bit of the straight-up ChatGPT or Claude or Gemini. And also in the context of my job, I feel like I need to stay on top of what the models are good at, which is a horse race, right? Today, this model seems to be the best at producing code, but that one seems to be good at, I don’t know... Gemini, for example, famously has the longest context window of all of the models. But if you fill up that context window to the brim, does it actually behave well, or does it start to forget later in the context window?
Well, that’s something that you can’t read on a spec sheet. You have to get a feel for it by working with it. So I feel like most of my use of generative AI ends up being this kind of testing out the vibes of the models, to see what they’re good at or what they appear to be strong at. Or are they chatty? I remember one of the releases of OpenAI’s models, we were like, “Wow, this version is really chatty. It seems to really want to add a lot of comments to code”, and we had to actually alter the prompts that we give it to tell it, “Shut up with the comments, please”.
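The long-context question Idan raises can be made slightly more systematic than pure vibes. The sketch below is a hypothetical "needle in a haystack" probe; `ask_model` is a stand-in stub, and a real probe would call an actual model API with the assembled context.

```python
# Hypothetical long-context recall probe ("needle in a haystack").
# `ask_model` is a stub; swap in a real model call to probe a real model.

def ask_model(context: str, question: str) -> str:
    # Stub that "forgets" everything past the first 1,000 characters,
    # mimicking a model that degrades deep into its context window.
    visible = context[:1000]
    return "42" if "magic number is 42" in visible else "I don't know"

def probe(depth: float, filler_chars: int = 5000) -> bool:
    """Bury a fact at a relative depth in filler text and check recall."""
    needle = "The magic number is 42. "
    filler = "Lorem ipsum dolor sit amet. " * (filler_chars // 28)
    position = int(len(filler) * depth)
    context = filler[:position] + needle + filler[position:]
    answer = ask_model(context, "What is the magic number?")
    return "42" in answer

# With this stub, recall drops once the needle is buried past the
# visible prefix; a real model shows a softer version of the same curve.
for depth in (0.0, 0.25, 0.5, 0.9):
    print(depth, probe(depth))
```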
So things like that end up being my primary uses for AI. But otherwise, yes, chat like everybody else: the LLM as informational alchemist, converting from any format to any other format. Take small text and make it big. Take big text and make it small. Take a diagram, make a text, or the other way around. Those are the natural applications of AI to my mind. And so I spend a lot of time thinking about that.
Eddie Aftandilian: I think for me, the most interesting use that I have, and it’s fairly new, is this bespoke tutor use of chat models. Let’s say I have a research paper or something and I want to deeply understand it. I can drop that into ChatGPT or Claude and then start asking it pretty difficult questions about the paper. And it’s pretty good at explaining stuff to me. And I find this especially helpful when I’m going out of my domain. So like I mentioned, my background is in programming languages. Maybe I want to understand deeply something from a machine learning paper. I find the models really helpful for understanding stuff like that. And then the other thing that’s been true for a long time is when you’re programming in a new framework or a new programming language, the models are super helpful for that. When I joined GitHub, I was really a Java developer.
I hadn’t written TypeScript. I’d written a little Python, but not at a work level of proficiency. And I come here and most of our projects are in TypeScript or Python, and I found even very early Copilot super helpful for teaching me basic syntax, teaching me idioms of the language. So what’s the idiomatic way to do this? I know one way to do it, but if I write it that way, it’s going to look like I’ve just transcribed Java into Python, which I know is not what a Python programmer would do. So for a long time I’ve found them super useful for this: “I’m well grounded in this language or this framework, now help me translate the concepts from that language into this new framework or language that I’m trying to work in”.
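To make the idiom gap Eddie describes concrete (my own illustration, not an example from the conversation): both functions below sum the squares of the even numbers, but the first reads like Java transcribed into Python and the second like native Python.

```python
# The same computation, written two ways.

def sum_even_squares_java_style(numbers):
    # Index-based loop, the way a Java developer might transcribe it.
    total = 0
    for i in range(len(numbers)):
        if numbers[i] % 2 == 0:
            total = total + numbers[i] * numbers[i]
    return total

def sum_even_squares_pythonic(numbers):
    # A generator expression, the idiomatic Python phrasing.
    return sum(n * n for n in numbers if n % 2 == 0)

print(sum_even_squares_java_style([1, 2, 3, 4]))  # → 20
print(sum_even_squares_pythonic([1, 2, 3, 4]))    # → 20
```

Asking a model "what's the idiomatic way to write this?" is exactly the translation from the first form to the second.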
Idan Gazit: Yes, big plus one to the new languages, new frameworks thing. I like making weird keyboards, and I like it enough that I fell down the rabbit hole: okay, none of the existing firmwares for the microcontrollers for custom keyboards are exactly what I want, so I’m going to write my own, right? This is a perennial side project that is still going on two years later. But I was like, “I don’t want to write C. It’s been years since I’ve written C. I want to do what the cool kids are doing, and the cool kids are doing a lot of Rust”. And it turns out that you can write Rust for bare metal nowadays. And without Copilot, man, it would’ve been really hard. But with this suggestion tool in my pocket, even if I don’t use the suggestion verbatim, just having it shine a flashlight, or show off idioms, or tell me this is how I would use this library, is incredibly valuable for me as a working developer, 100%.
Shane Hastie: Gents, we’ve covered a lot of ground and it’s getting really, really interesting. If people want to continue the conversation, where can they find you?
Idan Gazit: The Next Discord. If you go to gh.io/next-discord, that’s definitely our community of early adopters, people who love to injure themselves on the sharp edges of whatever it is that we’re making. We love it as a community of folks who are really interested in that sort of thing, a community of futurists, a community of folks who are interested in tools for thought, because at the end of the day, software development is thought made real. They’re the folks that we tend to come to first when we want feedback, when we want to field something. And so that’s probably the best way. We’re also at githubnext.com. We try to work as much as possible in the open, so folks can see all of our past projects and write-ups there. And of course, we have a Twitter and a Bluesky account. They’re all linked from the githubnext.com website. So yes, I think those are the good spots to meet us.
Shane Hastie: Well, thank you so much for taking the time to talk to us today.
Idan Gazit: Thank you. We really appreciate you having us on.
Eddie Aftandilian: This was fun.