Data Council 2025: The Data Science & Algorithms Track with Sean Taylor and Jesse Robbins
- Sean Taylor, Data Science, OpenAI
- Jesse Robbins, General Partner, Heavybit
Heavybit is thrilled to be sponsoring Data Council 2025, and we invite you to join us in Oakland from Apr 22-24 to experience 3 days of cutting-edge technical talks from the brightest minds in AI & data. Learn more at datacouncil.ai and use the code HEAVYBIT20 for 20% off.
Defining a Clear Vision for the Data Science Track
Jesse Robbins: Hey, this is Jesse Robbins. I'm one of the general partners here at Heavybit. We are extremely excited to be sponsoring this year's Data Council.
And to kick it off, we are interviewing the track hosts, including Sean Taylor, who just joined from OpenAI. Welcome, Sean.
Sean Taylor: Yeah, thanks Jesse.
Jesse: So Sean, you are hosting the data science and algorithms track.
I hear that you have some strong opinions, which is how you're putting that together.
You want to tell me about those and what the track's all about?
Sean: Absolutely, yeah. I mean I think it's important to distinguish your track and have like a really strong opinion about what it should be about.
I wasn't just kind of looking for the best talks, I was looking for the best talks about what I think data science really is meant to be and how it's going to unlock value for the companies where we operate.
And so for me, the cornerstone to that is experimentation and causal inference. Those are topics that are kind of important in statistics and in machine learning, but we apply them in companies to make better decisions on a day-to-day basis. So I really wanted the speakers that we had to reflect the kind of day-to-day work of someone who thinks that way and thinks that that's the way that we have impact on companies that we work at.
Jesse: One of the things that I know that you're highlighting is frameworks and methodologies that are actually being applied in modern research and data science environments.
You know, a really strong list of confirmed speakers. Can you tell me about who you're excited about and what they're going to be talking about?
Sean: Well, I'm excited about them all equally. They're all amazing people and I really feel very lucky to have such a great track.
We get a lot of submissions, but also I kind of go and seek out speakers to make sure that we have the set of topics that I want to portray there.
But these speakers are, I think, probably the best that I've had in all the years that I've been hosting. In particular, we have, Hadley Wickham is probably the star of the show.
He is somebody that I was a longtime fan of. Like, maybe 20 years ago when I was starting my career, he was somebody that I looked up to, and he's only gone on to do bigger and better things since then.
And he's built, like, the majority of the tools that people use in practice. Like, he's had his hand on most of them.
So, I kind of invited him to come and speak about whatever he has recently been thinking a lot about.
And so, it's kind of exciting to kind of, you get a peek into the future from someone like him who's been thinking really deeply about how generative AI can be integrated into data science workflows.
Deep Focus on Experimentation and Real-World Applications
Jesse: Wow, that's awesome. Who else? Tell us about the other speakers in the track.
Sean: Absolutely. Yeah.
I think one of the kind of hallmarks of the data science track is that we talk a lot about experimentation. And having people who work on experimentation in big companies but also at startups has been sort of one of the themes.
So, we got Timothy Chan, he's head of data at Statsig. Statsig for people who don't know, is one of the most popular A/B testing tools on the market.
You know, used by hundreds, probably thousands of startups and big companies at this point. I'm a big fan of the tool.
He's been working on experimentation for as long as anybody and he gets to talk to people who are running experiments at all these companies all the time.
Jesse: At a massive scale?
Sean: At a massive scale. So, that's a really unique opportunity to get insight from somebody who's not going to have the same perspective as somebody who's just been at the same company for a long time or even somebody with, you know, deep academic expertise.
He's somebody who's really hands-on with all the customers of their platform. So, very excited for Timothy Chan.
Also, Joe Powers, a principal data scientist at Intuit, will be speaking. He's really exciting, I think, for the alternative perspective of having somebody who's thought deeply about one specific problem for a long time.
And so, he's worked at Intuit for a number of years now and he's going to be talking about Bayesian A/B testing and how they apply it there.
So, getting that peek under the hood of how a big company, which, you know, can be a little mysterious, a company with like tons of resources and really hard problems, approaches solving them.
So, getting that insight from somebody who's been there for a long time and on kind of like one dedicated use case will, I think, be a really nice complement to Timothy's talk.
Jesse: One of the things that I loved about the conference last year was seeing really established practitioners, working really closely with people that are figuring out both what they're trying to do inside of their organizations and also how to justify or explain or even market internally to business users who might need to use or apply these technologies.
Can you talk about the Office Hours program and sort of the other elements in the conference that kind of really formalize the ability to connect people that know what they're doing with people that are trying to figure it out and make the case?
Creating Connections Through Office Hours and Community
Sean: Yeah, absolutely.
I think, you know, one of the hard things about work in like 2025 is that we don't get to peek over each other's shoulders, kind of see what people are doing in practice. So, that's kind of the role of a conference like this, is to like, get to tap on someone's shoulder, see what they're doing, have them share what they're working on and have that kind of deep interaction.
And so, one of the unique qualities of Data Council is that you get unique access to the speakers because we have office hours after the track, after the talk is over.
And then, in that Office Hours you can really spend dedicated time with that person. It's not just like a couple minutes after the talk, and it's a very social conference in general.
I think that the hallways are really great and you end up getting to spend a lot of time with everybody.
And that's what we're all looking for is kind of like validation of what are we working on? Is what we're working on kind of normal?
Can we learn from other people what they're doing and like figure out a way to improve what we're doing?
Data Council is kind of like one of the few places that we can do that these days.
Jesse: Last year, Roger Magoulas and I posted about how data science was having a DevOps movement.
A moment where community was starting to come together and formalize in a way that frankly we hadn't seen a lot, I think in part because of what you just said.
Where you don't get to peek over each other's shoulders. I find that data science teams are in some ways more isolated than others, which is part of why Data Council is so important.
That's why Heavybit sponsors it. It's why we participate so heavily, why we are hosting a party.
It seems to me that this is a really unique event and an opportunity that people frankly just needed, that ability to kind of reconnect and start working together again.
For you, what are your favorite parts about Data Council?
Sean: Yeah, I mean for me I feel like it's a reunion every time. So I really, I have specific people that I want to catch up with.
And so, you know, being in my track is really wonderful, but there's also lots of other people at the conference, so it being over multiple days and being able to space things out is really nice and there is a lot of downtime baked into it.
But I think for me, like the, you know, there's different concentric circles of community that you have for your profession.
You have community within your own company and you have the kind of like broader community and getting to be in person with people for this dedicated amount of time is really special and unique.
And so like, you know, carving out time to hang out in the hallway and just catch up and see what people are working on.
And that's usually what I ask people is kind of like, what challenging thing are you working on these days? Or like, you know, what's keeping you up at night?
And then, usually get a really interesting answer to that.
Jesse: When you selected speakers, how did that work? What was your process?
What was the analytics stack that you ran in order to find the right folks given the extraordinary quality of the people that you're bringing in?
Sean: It's a tough process because we do have open submissions for talks and we get a lot of great submissions, so it's really tough number one, to go through that and say no to a lot of talks 'cause there's a lot of people that probably could be speaking that we don't select.
We get so many great speakers. But also I want to complement that with some people from my network that I think kind of fill in the gaps from what we got in terms of submissions.
And I tried really hard to like, get opinions from the community about what should be in the data science track.
So, I went on LinkedIn and I posted, "What should be in the data science track in 2025?" to try to like just solicit people's opinions.
And it was like crickets. Like, I couldn't get anybody to form a strong opinion about what data science is now, after we have people forking off to be machine learning engineers.
There's all these disciplines that I think, I would say are kind of like forks of the original data science discipline.
So, what's left for us? So, I guess I just formed my own opinions about that and it's kind of what I think I see as most effective in practice at companies, which is people doing kind of like, deep analytical and statistical work.
People building tools that are useful to practitioners internally to the company. People practicing experimentation and causal inference that help improve decision making. When I can find people who can speak to that from some really recent and useful experience with real examples, I feel like that's what people in the track are going to enjoy and learn the most from.
Jesse: It's amazing. You know, one of my questions is like, what are the practical skills and insights that you want people to have? And you just described that to a T.
So, I'll ask you another one, which is, how much has AI changed the landscape in the last year since the last conference?
What is suddenly even more different now, particularly given that you just joined OpenAI?
AI’s Evolving Role in Data Science
Sean: So, two years ago I think was the original like, AI moment and I remember being at Data Council and the most packed talk was this talk about generating SQL from natural language.
I was like, okay, no one's going to be writing SQL ever again. And you know, spoiler alert. I write SQL every day.
So, I think that was kind of an interesting evolution of how we thought about it.
At first we were like, wow, it's coming for our jobs and there'll be nothing left for us to do 'cause all the queries will be written and it'll do all the analysis for us.
Jesse: The AI also doesn't want to write SQL, it turns out, yeah.
Sean: But now, having kind of like seen how things have played out, the AI systems are generating even more interesting data and generating more questions for us to answer for the businesses that we work in.
I think that that's really my big insight is that experimentation is even more important today than it was a few years ago because it's one of the few things that gives us like real insight into how these systems are behaving in practice. So, that kind of rigorous empiricism is now more valuable than it was before AI existed.
Jesse: Yeah, you can synthesize and collate information all you want, but if you run experiments, that's literally the only way you learn things.
Sean: Right.
Jesse: So, we have to do new stuff now.
Sean: Yes, exactly.
Jesse: Sean, before OpenAI, you were at Lyft and Facebook and you have had a variety of roles from research scientists to being a manager and now back to like core startup work where you are at a core data science role 'cause OpenAI is small, right?
Sean: Right.
Jesse: One of the things that I know is a priority for the conference is figuring out how to get people who are earlier in their career, who maybe haven't had foundational data science roles at the biggest companies that use the most data.
Is this track that you're running and the conference generally something that's accessible only to kind of people at your level who you're already connected with on LinkedIn?
Or what are you doing to make it more accessible for people earlier in their careers?
Making the Track Accessible for All Career Levels
Sean: Yeah, that's a great question and I'm very sensitive to that and I try to make sure that we have a track that's open for everybody, interesting to everybody.
And I would expect if you're attending Data Council, no matter what your role, you'll find something interesting in the data science track.
But I do think I want to have a strong opinion about what's in there content-wise. But when I talk to all the speakers about what the talk should be like, we talk a lot about the level.
Like, how to kind of calibrate it to the right audience. Like, who's attending the talk, what are they going to want to see?
Jesse: This isn't a poster session.
Sean: Yeah, and I think the speakers are, you know, they want to give a good talk and they want to make sure that they cater to the right audience as well.
And that means kind of having something for everybody. It means touching on advanced things that do kind of pique people's interest and make them want to follow up.
But also kind of covering the basics and making sure that you explain things at a really high level and talk a lot about the why and motivate the work.
And so, I think it starts with yeah, that framing for the speaker is about like, what is the goal here, what is the aesthetic that we're trying to hit?
And also curating for speakers that I know are likely to give good talks. Like they, you know, that they have a track record of giving talks that are going to be accessible to a broad audience.
Jesse: Yep. When I was putting together my own conference for the DevOps community, the big challenge was always like, you want to be memorable but for the right reasons.
And that means that you need speakers who are both experts in their field and entertaining and passionate, and are going to follow up with more great content and great conversations.
And I think that Data Council this year is really doing a fantastic job of increasing who's going to be able to come.
And I just think there's so much more interest now that people seem to care in a new way about both the value of data and experimentation and rigorous analytics.
Is there anything that you're glad you didn't include this time around? Like, what's on your no list?
Sean: That's a great question. You know, we had a really good talk from a couple years ago.
I also have previously worked on forecasting. So, I would lump that into data science work and I think that's still kind of uniquely something that data scientists tend to focus on.
But we don't have a forecasting talk this year. We had a really good one a couple years ago from the folks at Nixtla, which is a startup that works on forecasting packages.
I don't know if there's really anything left. Like, I just feel like maybe the interest in that has kind of declined a little bit or maybe a little bit of it is moving it over to Gen AI.
There are some new methodologies that are like, you know, pure AI approaches to forecasting. But that's one thing that's kind of on the cutting-room floor for this-
Jesse: Who could have forecasted this outcome?
Sean: Exactly.
Jesse: So, right now we're in a really interesting and I think challenging time in certain industries around like proving the value of data science and analytics.
Can you talk a little bit about sort of how you see people making the case for what they do?
Is that part of your track design is really sort of highlighting like, this is how you talk about the value of this type of work and how you communicate that to other people in both business and other contexts.
Demonstrating the Business Value of Data Work
Sean: Yeah, I think it's a really challenging discipline and field to be a part of from that perspective because when you make a good decision based on data, people really kind of take that for granted.
So, there's a lot of like kind of, I mean, Duncan Watts has this book called "Everything Is Obvious: Once You Know the Answer."
And I think that that's really a common theme with data work is that the value is so internalized so quickly 'cause it's just kind of like a perceptual system for people.
They learn some fact about the world that they didn't know and then they kind of assume that they always knew it. And that's really tough for us to overcome.
I do think people do spend a lot of time trying to justify the value of this practice, even though it doesn't directly like, turn into engineering output or things like that.
But we have to tell stories. And I think that that's kind of like one of the things that we do with data is how do we tell a story both of what should be done, but also how the work that we did was beneficial. And that's kind of like what we do at conferences like this.
'Cause a lot of these are case studies where people are going to motivate the work that they did and why they did it, and why it was valuable for them or the company that they worked at.
So, I hope that there are some lessons there in the track. But for me personally, one of the things that I like the most about data science is finding the weird things, like finding bugs.
Facebook used to have this thing every week at All Hands called "Fix of the Week." Maybe they still do it, I don't know.
And it was always like somebody who, looking at data, found some problem that they then fixed and generated more impact than most of the product teams could hope to in a year.
And I think that that's kind of the story to me is that like, if you don't look, then you don't know that there are problems. And I think data's one of the only ways that we have of knowing that there are problems.
Jesse: Speaking of that, are we going to get any hot takes on Airbnb and the no A/B testing conflict that they've managed to ignite?
Sean: Yeah, I mean, I can kind of see the perspective that, like, you can't A/B test your way from a horse to a car, and I'm very sympathetic to the idea that if your strategy's wrong, A/B testing is not going to help you.
So, I can see the argument. But I think almost all tools are kind of like, in the right hands can be very effective, and then in the wrong hands can be quite detrimental to what you're trying to achieve.
Jesse: Sean, what are the things that you're hoping personally to get out of the conference other than, obviously, having everyone say this was the best track?
Sean: I think to me, I still feel like I'm at the beginning of my career as somebody who works on AI.
I wasn't trained to work on AI, but I now work at OpenAI and I'm trying to learn as quickly as I can how to be effective in that space.
So, in our track we have a couple talks. Hadley Wickham is talking about applying generative AI to data science problems.
And we have Bryan Bischof, who is Head of Data Science and ML Engineering at Hex, who's worked on... you know, Hex Magic is partly his creation.
And these are the kind of practitioners I want to learn from. Want to see what people are doing and see what people are thinking because that space is evolving really rapidly.
And how do we as data scientists evaluate and evolve and improve AI products is something that's kind of like my current focus.
So, I want to make sure I learn as much from the practitioners at the conference as I can.
Jesse: That is awesome. Well, I think that it's going to be really amazing. I am looking forward to these talks. I am really looking forward to the event.
We at Heavybit are hosting our own party on the 23rd. So, hopefully you'll come there along with everyone from your track.
And once again, Data Council is coming up in Oakland, April 22nd to the 24th and tickets are available at datacouncil.ai.
Sean, thank you so much for joining me and can't wait to see you there.
Sean: Yeah, thank you.
Grab your Data Council 2025 Tickets!
Tickets are going fast. Don't miss out on an incredible event: get your ticket today, and use the code HEAVYBIT20 for 20% off.