
What is an AI Engineer with Shawn Wang (a.k.a Swyx) of @LatentSpaceTV

Humanloop


The podcast explores the rise of AI engineers, their unique skills for rapid deployment, and the competitive AI landscape shaping tech innovation.


The Rise of AI Engineers: A New Role in Applied AI

AI engineering is emerging as a distinct role, filling gaps left by traditional machine learning (ML) engineers. While ML engineers focus on optimizing models and working in the "one to n" phase, AI engineers are tasked with taking products from "zero to one," making models useful in real-world applications. This shift is driven by the rise of foundation models and open-source APIs, which have drastically reduced the time and resources needed to accomplish AI tasks. What once took years of research can now be done in a matter of hours using APIs.

AI engineers are closer to product development, often possessing front-end skills and working directly with product managers and domain experts. This collaboration is crucial, as no amount of engineering can fix a bad product decision. The role of AI engineers exists on a spectrum, from hardcore ML engineers to more product-oriented roles, with an API line separating the two. Companies are increasingly outsourcing ML engineering roles due to rising costs, while AI engineers are becoming essential for productizing AI.

The AI landscape is evolving rapidly, and those who specialize in AI engineering will outperform generalists. Speed is key in this field; a "fire, ready, aim" approach is more effective than traditional methods. Shipping a V1 product quickly and iterating based on real-world data is often the best strategy. Vertical startups, which leverage proprietary data and target niche markets, are outperforming horizontal tools in the AI space.

We're observing a once-in-a-generation shift in applied AI, fueled by the emergent capabilities and the open-source API availability of foundation models. A wide range of AI tasks that used to take five years and a research team to accomplish in 2013 now just require API docs and a spare afternoon in 2023.

The Role of the AI Engineer and the Changing Landscape of AI

Being early in the AI space offers some advantage, but it doesn't guarantee success. The role of an AI engineer is to fill gaps that aren't traditionally part of an ML engineer's skill set. While ML engineers focus on scaling from "one to N," AI engineers are more involved in the "zero to one" phase, where rapid iteration and learning from the market are key. The old ML engineering process was more deliberate, but now, speed is crucial. Winning comes from moving quickly, shipping products, and iterating based on market feedback. As a result, the approach has shifted to "fire, ready, aim" rather than the traditional "ready, aim, fire."

In this fast-paced environment, foundation models are becoming increasingly important. There are three foundation models launching soon, a significant development that wasn't publicly known before. This shift is part of a broader trend that was predicted in the essay "Rise of the AI Engineer." The essay argued that a new type of engineering role would emerge due to the capabilities of LLMs and foundation models, similar to how DevOps, SRE, and data engineers emerged in the past. This new role, the AI engineer, is becoming essential as AI tasks that once required years of research can now be accomplished with just API documentation and a few hours of work.

A key concept in this new landscape is the "API line." There's a spectrum between data-constrained and product-constrained roles, and the API line used to be internal within companies. However, with the rise of foundation models, this line is increasingly between companies. As the cost of creating models rises and model labs become more closed, companies are outsourcing their ML work. This has led to the rise of specialist engineers on the product side of the API line, who are up to date on the rapidly evolving AI stack.

Keeping up with the AI stack is a full-time job. The stack is deepening every day, and those who specialize and stay updated will outperform generalists. The people who put in the effort to stay current with the latest developments in AI will have a significant advantage over those who try to be generalists in this rapidly changing field.

The Evolution of the AI Engineer Role

The concept of an AI engineer emerged from a need to bridge the gap between traditional software engineers and the highly specialized roles of data scientists or ML engineers. Companies were looking for engineers who could focus more on AI than the average software developer, but without requiring the deep research background typical of data scientists. Engineers, on the other hand, wanted to specialize in AI without needing a PhD or years of research experience. This led to the creation of the AI engineer role, which allows for a more accessible entry into the field of AI, especially with the rise of foundation models. These models have made traditional qualifications, like PhDs, less relevant, as tasks that once took years can now be accomplished in hours or days with the right tools.

This shift represents a generational and platform change, similar to what happened with SRE, DevOps, and data. The rise of foundation models has opened up new opportunities for engineers to specialize in AI without needing the deep math or research skills of an ML engineer. While having more ML skills can make an AI engineer more successful, it’s not strictly necessary to get started. In the early stages, an AI engineer might only need to know how to prompt and use APIs. However, as they advance and start working on production-level AI products, they will need to invest in more ML skills, such as building an operational stack, fine-tuning models, and handling inference and evaluation capabilities.

AI engineers also fill gaps that ML engineers traditionally don’t cover, particularly in areas like agents. While ML engineers have been involved in agent research, AI engineers, with their different assumptions and experiences, might be better suited for this work. This highlights a qualitative or anthropological difference between AI engineers and ML engineers. AI engineers tend to work on different types of problems and have a different mindset compared to ML engineers, who often focus on scaling solutions ("one to n" problems).

The rise of foundation models has also changed the perception of what’s difficult in AI. Tasks that once required a research team and years of work can now be accomplished in a matter of hours or days. This shift is humorously captured in the xkcd cartoon "Tasks": asked to build an app that checks whether a photo was taken in a national park, the engineer says it's easy, but asked to also check whether the photo is of a bird, she responds that it would take a research team and five years. The cartoon illustrates how the perception of difficulty in AI tasks has historically been off, and how advancements in AI have made previously difficult tasks much easier.

In summary, the AI engineer role is a response to the changing landscape of AI, where traditional qualifications are no longer as important, and new opportunities are emerging for engineers to specialize in AI without needing a deep research background. The role is still evolving, and the skills required will depend on how deeply an engineer wants to engage with AI, but the rise of foundation models has made it easier than ever to get started.

AI Engineers vs ML Engineers: Different Roles, Different Mindsets

AI engineers and ML engineers operate in distinct phases of product development. AI engineers are involved in the "zero to one" phase, where they take a model and figure out how to turn it into a useful product. Their focus is on building something from scratch, often with front-end skills and a product-oriented mindset. In contrast, ML engineers work in the "one to n" phase, where they refine and optimize models, often dealing with more mathematically complex tasks. For them, the model is the center of their work, and they are more specialized in that area.

The difference in focus means that AI engineers and ML engineers have different skill sets and mindsets. AI engineers are more product-focused, while ML engineers are more model-focused. While it’s possible for someone to cross over between these roles, each has a "home base" where they are most comfortable. The kind of person who excels in one role may not necessarily succeed in the other.

In terms of team composition, AI engineers tend to outnumber ML engineers, especially in more mature teams. A typical ratio might be four AI engineers for every ML engineer. This is partly because AI engineers handle tasks that ML engineers might avoid, such as product-related work that isn’t directly tied to the model. Additionally, there simply aren’t enough ML engineers being trained to meet the demand, so companies supplement their teams with more AI engineers.

The Importance of Product Managers and Domain Experts

While AI engineers play a crucial role in building products, product managers and domain experts are just as important, if not more so. These roles provide the essential insights into customer needs and product direction, which no amount of engineering can replace. A bad product decision can’t be fixed by good engineering, so having someone who understands the customer and the market is key.

AI engineers are often better equipped than ML engineers to communicate with product managers and domain experts about what’s possible with the latest foundation models. This is because AI engineers are product thinkers first and foremost, and they are more in tune with the state-of-the-art in AI technology.

Collaboration Between Roles

The relationship between product managers, domain experts, and engineers is evolving. In the past, product managers were primarily responsible for defining the specs and customer problems, while engineers translated those specs into code. However, with the rise of AI, product managers and subject matter experts are becoming more directly involved in the creation process. They are now writing prompts and creating artifacts that are part of the product itself, working more closely with AI engineers than they ever did with ML engineers.

This shift in collaboration is a natural progression as AI technology becomes more integrated into product development. It allows for a more seamless interaction between the people who understand the customer and the people who understand the technology, ultimately leading to better products.

The Emergence and Debate Around the AI Engineer Role

The AI engineer role has emerged recently, but not without its critics. One common argument is that every software engineer should be an AI engineer, making the new title unnecessary. This perspective, often held by those who prefer generalization over specialization, suggests that AI should be part of every engineer's skill set. However, this hasn't been the reality. Many engineers remain skeptical about AI, with around 50% of the Hacker News community still not adopting tools like GitHub Copilot. The future of AI is here, but it's unevenly distributed—some people take it more seriously than others, and that's fine. Not everyone needs to work on AI; there are other important areas like distributed systems, front-end development, and databases.

Despite its growing presence, the AI engineer role is likely to remain low-status for a while, especially compared to machine learning engineers and research scientists. The bar for entry is low, and people will continue to question the role's existence. This is part of the natural process of defining new roles and boundaries in the tech world.

The primary goal of defining the AI engineer role is to create a "Schelling point" where people can find each other in the job market. The role is still evolving, and there isn't yet a clear skill ladder, like senior or staff AI engineer. However, some companies, like Mastercard and OpenAI, have already adopted the title. It's particularly interesting to see a company like Mastercard, a clear incumbent, embrace the role of VP of AI engineering.

OpenAI has its own interpretation of the AI engineer role, requiring 5-7 years of machine learning engineering experience for a position focused on fine-tuning models. This requirement is debatable, as not all AI engineering roles need that level of experience. OpenAI leans more towards the machine learning engineer spectrum of AI engineering, which is fine, but it highlights the spectrum and fuzzy nature of the role.

The AI engineer role exists on a spectrum, and its fuzzy nature makes some people uncomfortable. There's also discomfort with the hype surrounding the term "AI." Some propose alternative titles like "LLM engineer" or "cognitive engineer," but "AI engineer" is the most natural and widely accepted term.

The Legitimacy of the AI Engineer Title and the Importance of Being Early

The AI Engineer title is still questioned by some, with people on platforms like Hacker News finding it annoying or unnecessary. However, the demand and supply of work in this field are undeniable. The role may feel "illegitimate" now, but it will feel less so with each passing year. The work involved is substantial, and the field is evolving rapidly. The organizer's entire body of work—his conference, podcast, and newsletter—aims to define and legitimize this role.

Being early in the AI engineering field offers some advantages, but it doesn't guarantee success. While being early allows you to keep up with the vast amount of papers, techniques, and APIs, it doesn't mean you'll automatically win. The later you join, the more you'll have to catch up on, but it's not impossible. It's just harder. The example of Auto-GPT illustrates this point: they were early, gained a lot of attention, but ultimately flamed out. Being early doesn't protect you from failure.

The hype surrounding AI products can be both a blessing and a curse. While hype can attract a lot of interest and investment, it can also be fleeting. If a product doesn't deliver, people will move on just as quickly as they came. The key is to focus on deep, sustainable, and hard problems. Hype will come and go, but long-term success requires solving real, lasting issues.

The AI Engineer World Fair: A Multi-Track Event

The AI Engineer World Fair is the organizer's highest-stakes expression of his vision for what AI engineering should be. The event has evolved from a single-track conference to a multi-track one, driven by the sheer amount of interest and the clear swim lanes of work people are doing. Last year, the event had one track, but this year, it has expanded to include multiple tracks, such as multimodality, evals and ops, and agents.

One of the new tracks this year is "AI in the Fortune 500," which was added to address the criticism that AI is only for startups. The organizer felt that last year's event was too focused on startups, which can lead to a "navel-gazing" Silicon Valley bubble where companies talk big but don't generate much revenue. In contrast, the Fortune 500 track brings in speakers from household names like Coinbase and Salesforce. The organizer believes that in a capitalist society, "revenue is the only thing that keeps you honest," and he's proud to have initiated this track.

For the first time, the event also includes a "VPs of AI" track, aimed at higher-level discussions within the field. The goal of the event is not just to share knowledge but to foster many-to-many connections, allowing people to meet each other without being limited by the organizer's involvement.

Moving Fast and Networking in AI Conferences

Conferences are often seen as a place to attend talks, but the real value lies in the conversations and networking that happen outside the sessions. The ultimate goal is to have such great conversations that the talks become secondary. These events are expensive, and people need to justify the cost by making them work-relevant—whether it's finding jobs, hiring, or launching products. The speaker is clear: "The ultimate win condition... you just show up and... my conversations here are so great like I don't need anything else." However, the cost of attending means that conferences must also offer tangible benefits, like helping people level up in their careers or launch new products.

On the topic of product launches, the speaker reveals that three foundation models are set to launch at the event, though this hasn't been publicly announced before. While they aren't at the level where companies give them their biggest launches, they are becoming more legitimate with each iteration of the event. "We have three foundation models launching... I've never said that publicly, but you heard it here first," they say, hinting at the growing importance of their platform.

The event itself has seen significant growth. Last year, they had 500 in-person attendees and 20,000 online, with 150,000 asynchronous viewers for one of the top talks. This year, they are aiming for 2,000 in-person attendees and scaling everything up by four times. "Last year our in person was 500 and the online audience was 20,000... this year we're just basically trying to make everything four times larger," the speaker notes, underscoring the rapid expansion.

When it comes to advice for AI engineers and product leaders, the speaker emphasizes the need to move fast. Traditional development timelines are too slow for AI products, and companies should aim to reduce the time it takes to ship a product from months to weeks. The idea is to adopt a "fire, ready, aim" approach, where you ship quickly, gather feedback, and iterate based on real-world data. "You should move fast... if you're taking like three months to shape a thing, what could you do to make it three weeks?" they ask, challenging the conventional wisdom of slow, deliberate development.

The speaker also draws an interesting analogy between gradient descent in machine learning and product development. Instead of optimizing model weights, companies should optimize their products by gathering feedback from the market and iterating quickly. Shipping a V1 that is "good enough" allows companies to gather valuable data that can be used to improve the product. "The gradient descent is performed in the marketplace of the user rather than in the model weights," they explain, highlighting the importance of real-world feedback in product development.
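The analogy can be made concrete with a toy loop, where user feedback plays the role of the loss gradient. This is purely an illustrative sketch of the metaphor; the `user_feedback` function and its target are invented for the example, not anything from the talk:

```python
# Toy illustration of "gradient descent in the marketplace":
# each release is an optimization step; market feedback is the gradient signal.

def user_feedback(quality: float) -> float:
    """Hypothetical stand-in for market response: how far the current
    product is from what users actually want (ideal quality = 1.0)."""
    return 1.0 - quality

def ship_and_iterate(v1_quality: float, learning_rate: float = 0.5,
                     releases: int = 8) -> float:
    """Ship a 'good enough' V1, then improve each release using feedback."""
    quality = v1_quality
    for _ in range(releases):
        gradient = user_feedback(quality)    # learn from real usage
        quality += learning_rate * gradient  # fold it into the next release
    return quality

# Even a rough V1 (quality 0.3) converges toward fit after a few releases.
final = ship_and_iterate(v1_quality=0.3)
assert final > 0.95
```

The point of the sketch is the loop structure: the optimization step happens after shipping, driven by users, not before shipping, driven by internal deliberation.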

However, moving fast doesn't mean ignoring risks. Companies need to be mindful of issues like hallucinations in AI models, but these risks are often overstated. Most companies can mitigate them by setting the right expectations with users. "Most companies you are not Google... go ahead and experiment like get out of your own head go try things," the speaker advises, encouraging companies to be bold in their experimentation.

Finally, the speaker hints at a "spicier" take that they have been sitting on for a while, but they don't reveal it in this segment. The non-spicy take is that everyone should be moving faster, regardless of whether they are working in AI or not. "Everyone should be moving faster, AI or not... that's a universal truth," they conclude, leaving the audience curious about the spicier take yet to come.

Vertical Startups Outperforming Horizontal Ones

Vertical startups are proving to be more successful than horizontal ones. These companies focus on specific industries or problems, often leveraging proprietary data and targeting high-margin markets. They cater to non-technical audiences who have pressing pain points that no one else is addressing. This gives them a unique advantage, as they can offer AI solutions that are highly valued by their customers. In contrast, horizontal startups, which try to serve a broad range of industries, face more competition and are more likely to fail. The market is brutal, and many horizontal players will likely be shaken out.

The key to vertical startups' success lies in their deep understanding of their customers' needs. They are able to solve specific problems that others overlook, and this allows them to grow faster and make more money. For example, a company working in AI and construction is thriving because it is willing to tackle real-world problems that most software engineers avoid. While many developers prefer to build tools for other developers, those who are willing to engage with the messy, real-world challenges are rewarded accordingly.

Examples of Successful Vertical Startups

Several vertical AI startups are already making waves. Harvey, for instance, is well-known in the legal space, while Midjourney has become a major player in the creative market, generating between $200 million and $300 million a year with a small team. Perplexity is another example, positioning itself as an anti-Google, and while its profitability is still an open question, it has certainly made a dent in public perception.

Other vertical startups include Photo AI, which focuses on virtual staging for real estate agents, and developer tools like Cursor and Copilot, which are becoming essential in their respective fields. These companies are succeeding because they are solving specific problems in their industries, and their customers are raving about their products.

The Challenges of Horizontal Startups

Horizontal startups, particularly in the developer tools space, face a much tougher battle. The market is crowded, and companies need to meet a long list of requirements to succeed. They must be scalable, open-source, and meet various security certifications, among other things. This makes it much harder to stand out and win customers. In contrast, vertical startups can focus on solving a single pain point and still achieve significant success.

For example, a vertical product that solves a specific problem can quickly gain a loyal customer base, even if the customers don't fully understand the technology behind it. This is in stark contrast to horizontal startups, where everything needs to be perfect just to compete.

AI in Specific Domains

AI is being applied in a wide range of verticals, from legal and creative industries to real estate and finance. Brightwave, for example, is using AI to help hedge funds conduct research more quickly, a task that is often commoditized but can be greatly improved with language models. While there are concerns about AI hallucinations, these are no different from the mistakes human analysts make.

There are also opportunities in medical AI, though specific examples weren't mentioned. Summarization is another vertical that has yet to be fully explored. While many companies treat summarization as a feature, there is potential to build it into a full-fledged product.

Advice for Buyers of AI Tools

For those looking to buy AI tools, the best approach is to start by purchasing existing solutions. This allows companies to understand their own needs before deciding whether to build something in-house. Building everything from scratch can slow down progress, and it's important to avoid the "not invented here" syndrome.

Some tools, like evaluation platforms and observability for APIs, are essential from the start. These tools help ensure that AI systems are functioning properly and can be monitored effectively. While some vendors may charge a lot for these services, they are crucial for any AI-driven business.

AI Tools, Employees, and the Four Wars

In the fast-moving world of AI, it makes sense to buy tools that help you move faster, even if they don’t perfectly fit your needs. The idea is to explore existing solutions first, and only build your own once you’ve fully understood the problem. You’re not the only one facing issues like key rotation or tracking inputs and outputs for fine-tuning. Everyone has the same problems, so it’s smarter to buy tools and share the development cost with others. If you find that none of the solutions work for you, or you’re overpaying, you can always build your own later. But initially, just buy what you need to move quickly.

There’s a distinction between AI product tooling, which is for building products for your customers, and internal productivity tooling, which is adopted much faster. Developer tools like Copilot, Cursor, and Sourcegraph are already widely used and form the baseline for internal productivity. Beyond that, there’s a growing stack of tools like meeting summarizers, but the real game-changer will be AI employees. These virtual workers can perform tasks that would normally be assigned to humans, and they can work all night without rest. While this concept is still in its early stages, it’s clear that the companies that can leverage AI to handle these tasks will have a significant advantage. AI employees will start small but grow over time, and those who adopt them will be the most capital-efficient.

However, it’s important to remember that AI and humans are fundamentally different. AI is already superhuman in some areas, like memory and retrieval, but much weaker in others. There won’t be a moment when AI reaches "human level" because it’s already better than humans in some dimensions and worse in others. The way we interact with AI employees will be very different from how we interact with human employees, and we’ll find that there are some tasks we would never give to a human, and vice versa.

In terms of trends, there’s a lot of noise in the AI space, with new papers and developments being released every day. But there are a few key battlegrounds that every serious AI company will need to focus on. These include the fight over data, GPUs, general-purpose models vs. domain-specific models, and regulation. These are the "four wars" that will determine the future of AI, and not all participants can win.

Interestingly, open source is not a battleground in the technical domain because most people in tech are pro-open source. However, it is a political issue. On the other hand, code generation, while an important problem, is not a battleground yet because everyone is still trying to figure out the next step.

Navigating the research space in AI requires a strong filter to separate meaningful work from the noise. Social media, especially Twitter, is flooded with influencers hyping every new paper as revolutionary. For example, a recent claim suggested that matrix multiplication was no longer necessary, leading to exaggerated conclusions like "Nvidia is going to zero." Without a filter, it's easy to get swept up in these overblown reactions.

In terms of research priorities, a ranked list of directions has emerged:

  1. Long inference
  2. Synthetic data
  3. Alternative architectures
  4. Mixture of experts and merging models
  5. Online learning systems (OLS)

These are the key trends shaping the future of AI research.

A significant trend in AI is the "Moore's Law of AI," where the cost of inference is dropping rapidly, much like the cost of semiconductors. The cost of running a model that scores 70 on MMLU decreases by 5-10x every year. This trend allows companies to build products that may lose money today but will become profitable as costs continue to drop. The idea is that Moore's Law will eventually "bail you out" of any bad inference ideas.

MMLU, or Massive Multitask Language Understanding, is a benchmark created by Dan Hendrycks. It tests models on professional exams across various domains and is currently the primary metric for comparing large language models (LLMs). However, benchmarks like MMLU are temporary and will likely be replaced in a few years as the field evolves.

The cost of achieving the same level of AI intelligence is also dropping. For example, in 2022, running a GPT-4 level model cost around $20 per million tokens. Today, that cost has dropped to $2, and it will likely continue to decrease to $0.50 or even $0.25. However, as costs drop, the bar for what is considered acceptable AI intelligence will rise with the release of new models like GPT-5. This follows a classic cost curve model, similar to what is seen in semiconductors.

Another key trend is the commodification of intelligence. As the cost of inference drops, intelligence becomes more accessible. Inference speed is also increasing, with models moving from 70-100 tokens per second to potentially 5,000 tokens per second. Every 10x improvement in speed unlocks new product possibilities. For example, generating 5,000 tokens per second would allow for the creation of a full essay in just one second, making streaming tokens unnecessary.
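The "full essay in one second" claim is easy to check with back-of-envelope numbers. The essay length here is an assumption (a ~1,500-word essay at the common ~0.75 words-per-token heuristic is roughly 2,000 tokens), not a figure from the conversation:

```python
# Back-of-envelope: generation time for an essay at different token rates.
# Assumes ~2,000 tokens per essay (roughly 1,500 words at ~0.75 words/token).

ESSAY_TOKENS = 2000

def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    return tokens / tokens_per_second

today  = generation_seconds(ESSAY_TOKENS, 100)   # today's ~100 tok/s
future = generation_seconds(ESSAY_TOKENS, 5000)  # the projected 5,000 tok/s

assert today == 20.0   # ~20 s: long enough that streaming tokens matters
assert future < 1.0    # sub-second: streaming becomes unnecessary
```

This is the sense in which each 10x speedup changes the product, not just the latency: a 20-second wait demands streaming UI, while a sub-second response can be treated like an ordinary page load.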

Context length is also expanding dramatically. Models used to have a context length of 4,000 tokens, which was considered sufficient. Now, models can handle up to a million tokens, opening up new use cases, though it's not yet clear what all of them will be.

Finally, the trend towards multimodal models is becoming increasingly clear. Models like GPT-4 are now capable of handling multiple input and output modalities, and this is becoming a standard expectation in the field. The phrase "all modalities in, all modalities out" captures this shift towards multimodal everything.

Variance, Creativity, and New Knowledge in AI

Variance is an emerging trend in AI that could become more permanent over time. Many current AI use cases are what can be called "temperature zero" use cases, where the model is locked down to be deterministic and predictable. This is often seen as the "most boring possible use case." The idea is to make the model retrieval-oriented, ensuring it generates only what is expected. But what if, instead of chaining the model down, you let it loose? What if hallucination was a feature, not a bug?

This is where "temperature two use cases" come in. These use cases embrace hallucination and creativity, allowing the model to think of things you never thought of. Creativity is expensive, and if AI can provide it for a few dollars, it’s worth considering. In this context, hallucination becomes a valuable tool, not something to be avoided.
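The "temperature" in these phrases is the sampling temperature, which rescales the model's next-token scores before they are turned into probabilities. The sketch below shows the mechanism on toy logits; the numbers are invented for illustration:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Sampling temperature divides logits before the softmax:
    low T sharpens the distribution toward the top token (deterministic),
    high T flattens it, giving unlikely ('creative') tokens more mass."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy next-token scores

cold = softmax_with_temperature(logits, 0.1)  # the "temperature zero" regime
hot  = softmax_with_temperature(logits, 2.0)  # the "temperature two" regime

assert cold[0] > 0.99                          # nearly all mass on top token
assert hot[0] < cold[0] and hot[2] > cold[2]   # tail tokens gain probability
```

"Temperature zero" use cases pin sampling to the argmax for predictability; "temperature two" use cases deliberately spend probability mass on the tail, which is where the surprising, possibly-hallucinated, possibly-creative outputs live.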

But creativity isn’t just about generating new ideas; it’s also about creating new knowledge. The combination of a model that acts as a "conjecture machine"—one that can propose possible explanations—and a system that can test and measure those conjectures could lead to AI systems that generate entirely new knowledge. For this to work, the system needs to operate in a "high temperature, non-deterministic mode," allowing it to explore possibilities beyond the predictable.

For those interested in exploring these ideas further, the AI Engineer conference is a great place to start. The domain for the event is "ai.engineer," and a discount code "Agency" is available for last-minute tickets. Shawn Wang, who is active on Twitter under the handle "swyx," also hosts the "Latent Space" podcast, where more conversations on these topics can be found. The domain "lat.space" was even donated by a listener, adding a fun twist to the project.

As the conversation wraps up, Raza Habib encourages listeners to rate and review the podcast on platforms like Spotify and Apple Podcasts. Show notes and more episodes are available at "humanloop.com/podcast." Raza also invites feedback and ideas from listeners, providing an email address and social media handle for contact.

Conclusion

AI engineers are increasingly important, especially in vertical startups using proprietary data. Community engagement through events like the AI Engineer conference is vital for networking and strategy discussions.

