Full Interview with DeepSeek Founder Liang Wenfeng: A Journey Driven by Curiosity and Idealism
This translation is brought to you by...DeepSeek. Enjoy!
This long interview was conducted six months ago and originally published by the Chinese WeChat media outlet Waves, written by Yu Lili and edited by Liu Sheng.
I read it today on Founder Park and admired the deeply self-fulfilling journey driven by curiosity and innovation, so I translated the full interview using DeepSeek. I edited only a couple of words, far fewer than when I used ChatGPT Plus to translate between Chinese and English.
It offers a rare glimpse into the mindset and culture of a company pushing the boundaries of AI and AGI. The emphasis on curiosity, passion, and long-term thinking over short-term gains is both inspiring and thought-provoking. It challenges conventional business practices and offers a fresh perspective on what it takes to drive true innovation. (This summary is also written by DeepSeek, by the way.)
---
01 How did the price war start?
Waves: After the release of DeepSeek V2, it quickly triggered a fierce price war in the large model industry. Some say you are a disruptor in the industry.
Liang Wenfeng: We didn’t intend to be a disruptor; it just happened by accident.
Waves: Were you surprised by this outcome?
Liang Wenfeng: Very surprised. We didn’t expect pricing to be such a sensitive issue. We were just following our own pace and pricing based on cost calculations. Our principle is not to lose money, nor to seek excessive profits. The price we set is slightly above cost, with a small profit margin.
Waves: Five days later, Zhipu AI followed suit, and then ByteDance, Alibaba, Baidu, Tencent, and other major players joined in.
Liang Wenfeng: Zhipu AI only reduced the price of an entry-level product. Their models at the same level as ours are still very expensive. ByteDance was the first to truly follow suit. They lowered the price of their flagship model to match ours, which then triggered other major players to reduce their prices. Since the cost of these large companies’ models is much higher than ours, we didn’t expect anyone to do this at a loss. In the end, it turned into the logic of burning money for subsidies, reminiscent of the internet era.
Waves: From an external perspective, the price cuts seem like an attempt to grab users, which is typical of price wars in the internet era.
Liang Wenfeng: Grabbing users wasn’t our main goal. We lowered prices partly because our costs decreased as we explored the structure of the next-generation model, and partly because we believe that APIs and AI should be inclusive and affordable for everyone.
Waves: Before this, most Chinese companies would directly copy the Llama structure to build applications. Why did you choose to focus on the model structure instead?
Liang Wenfeng: If the goal is to build applications, then using the Llama structure and quickly launching products is a reasonable choice. But our destination is AGI (Artificial General Intelligence), which means we need to research new model structures to achieve stronger model capabilities with limited resources. This is one of the foundational studies required to scale up to larger models.
In addition to model structure, we’ve also conducted extensive research on other aspects, such as how to construct data and make models more human-like, all of which are reflected in the models we’ve released. Moreover, the Llama structure is estimated to be two generations behind international advanced levels in terms of training efficiency and inference costs.
Waves: Where does this generational gap mainly come from?
Liang Wenfeng: First, there’s a gap in training efficiency. We estimate that compared to the best international standards, there’s likely a twofold gap in model structure and training dynamics. This means we need to consume twice the computing power to achieve the same results. There’s also a similar gap in data efficiency, meaning we need twice the training data and computing power to achieve the same results. Combined, this means we need four times the computing power. What we’re doing is constantly working to narrow these gaps.
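The arithmetic in this answer can be sketched in a few lines of Python. This is only an illustration, not something from the interview: it assumes the two gaps are independent and multiply, which is one reading of "combined, this means we need four times the computing power." The function and parameter names are hypothetical.

```python
def combined_compute_gap(structure_gap: float, data_gap: float) -> float:
    """Return the overall compute multiplier implied by two efficiency gaps.

    Assumption (not stated in the interview): the gaps compound
    multiplicatively. A 2x gap in model structure and training dynamics
    means each training token costs twice the compute, and a 2x gap in
    data efficiency means twice as many tokens are needed.
    """
    return structure_gap * data_gap


# The figures quoted in the interview: a twofold gap on each axis.
total_gap = combined_compute_gap(structure_gap=2.0, data_gap=2.0)
print(f"Estimated compute needed for the same result: {total_gap:.0f}x")
```

Under this reading, halving either gap alone would still leave a twofold disadvantage, which is why both are described as needing to be narrowed.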
Waves: Most Chinese companies choose to focus on both models and applications. Why has DeepSeek chosen to focus solely on research and exploration for now?
Liang Wenfeng: Because we believe the most important thing right now is to participate in the wave of global innovation. For many years, Chinese companies have been accustomed to others driving technological innovation while we focus on monetizing applications. But this isn’t something that should be taken for granted. In this wave, our starting point isn’t to seize an opportunity to make a profit, but to advance to the forefront of technology and push the entire ecosystem forward.
Waves: The inertia left by the internet and mobile internet eras is that the U.S. excels in technological innovation, while China is better at applications.
Liang Wenfeng: We believe that as the economy develops, China should gradually become a contributor rather than always free-riding. Over the past three decades of IT waves, we’ve hardly participated in real technological innovation. We’ve grown accustomed to Moore’s Law dropping from the sky, with better hardware and software appearing every 18 months without effort. Scaling Law is being treated the same way.
But in truth, this is the result of generations of relentless effort by the Western tech community. We’ve overlooked its existence because we didn’t participate in this process before.
02 The Real Gap Lies Between Originality and Imitation
Waves: Why did DeepSeek V2 surprise so many people in Silicon Valley?
Liang Wenfeng: Among the vast amount of innovation happening daily in the U.S., this is a very ordinary one. What surprised them was that a Chinese company was joining their game as an innovator and contributor. After all, most Chinese companies are accustomed to following rather than innovating.
Waves: But in the Chinese context, this choice seems overly extravagant. Large models are a capital-intensive game, and not all companies have the resources to focus solely on research and innovation without prioritizing commercialization first.
Liang Wenfeng: The cost of innovation is certainly not low, and the past habit of borrowing ideas was tied to the circumstances of the time. But now, whether you look at China’s economic scale or the profits of giants like ByteDance and Tencent, they are not insignificant on a global level. What we lack in innovation is certainly not capital, but confidence and the know-how to organize high-density talent to achieve effective innovation.
Waves: Why do Chinese companies—including well-funded giants—so easily prioritize rapid commercialization above all else?
Liang Wenfeng: Over the past thirty years, we’ve only emphasized making money, while neglecting innovation. Innovation isn’t entirely driven by business; it also requires curiosity and a desire to create. We’ve been constrained by past inertia, but it was just a phase.
Waves: But you are ultimately a commercial organization, not a public research institution. By choosing innovation and sharing it through open source, where do you build your moat? For example, won’t the MLA architecture innovation you introduced in May 2024 be quickly copied by others?
Liang Wenfeng: In the face of disruptive technology, the moat formed by closed-source systems is short-lived. Even OpenAI, despite being closed-source, cannot prevent others from catching up. So we focus on building value within our team. Our colleagues grow through this process, accumulating a lot of know-how and forming an organization and culture capable of innovation—that is our moat.
Open-sourcing and publishing papers don’t mean we lose anything. For technical professionals, being followed is a deeply fulfilling experience. In fact, open source is more of a cultural act than a commercial one. Giving is an additional honor, and a company that does this also gains cultural appeal.
Waves: What do you think of the views of market-first believers such as Zhu Xiaohu?
Liang Wenfeng: Zhu Xiaohu is self-consistent, but his approach is more suited to companies focused on quick profits. If you look at the most profitable companies in the U.S., they are all high-tech companies that have built up their capabilities over time.
Waves: But in the field of large models, pure technological leadership is hard to turn into an absolute advantage. What is the bigger thing you’re betting on?
Liang Wenfeng: What we see is that Chinese AI cannot remain in a follower position forever. We often say that Chinese AI is one or two years behind the U.S., but the real gap lies between originality and imitation. If this doesn’t change, China will always be a follower. So some exploration is inevitable.
NVIDIA’s leadership isn’t just the result of one company’s efforts but the collective work of the entire Western tech community and industry. They can see the next generation of technological trends and have a roadmap in hand. The development of Chinese AI also needs such an ecosystem. Many domestic chips fail to develop because they lack a supporting tech community and only have second-hand information. Therefore, it’s essential for someone in China to step up to the forefront of technology.
---
03 High-Flyer’s Goal in Building Large Models: Research and Exploration
Waves: High-Flyer decided to enter the field of large models. Why would a quantitative private fund undertake such an endeavor?
Liang Wenfeng: Our work on large models isn’t directly related to quantitative finance. We established a separate company called DeepSeek to focus on this. Many of the core members at High-Flyer come from an AI background. We explored many scenarios and eventually settled on the complexity of finance. Artificial general intelligence (AGI) might be one of the next great challenges, so for us, it’s a question of how to do it, not why.
Waves: Are you aiming to train a general large model or one tailored to a specific vertical, like finance?
Liang Wenfeng: Our goal is AGI—artificial general intelligence. Large language models are likely a necessary step toward AGI and already exhibit some of its characteristics, so we’ll start there. Later, we’ll expand to areas like vision.
Waves: Due to the entry of major tech companies, many startups have abandoned the idea of focusing solely on general-purpose large models.
Liang Wenfeng: We won’t rush into designing applications based on the model. Instead, we’ll stay focused on the large model itself.
Waves: Many believe that for startups, entering the field after major players have established consensus is no longer good timing.
Liang Wenfeng: At this point, neither major companies nor startups can easily build a crushing technical advantage in the short term. With OpenAI leading the way and everyone building on publicly available papers and code, both major players and startups will likely have their own large language models ready by next year. Both have opportunities. However, vertical scenarios aren’t in the hands of startups, making this phase less friendly to them. But because these scenarios are ultimately fragmented and niche, they are better suited to agile startup organizations.
In the long run, the barriers to applying large models will continue to lower, and startups will have opportunities to enter at any point in the next 20 years. Our goal is clear: we’re not focusing on verticals or applications but on research and exploration.
Waves: Why do you define your mission as “research and exploration”?
Liang Wenfeng: It’s driven by curiosity. On a broader level, we want to test certain hypotheses. For example, we believe that human intelligence might fundamentally be about language—that human thought might simply be a process of language. What you perceive as thinking might just be your brain weaving language. This suggests that human-like AI (AGI) could emerge from large language models. On a more immediate level, GPT-4 still holds many mysteries. While we replicate it, we’ll also conduct research to uncover its secrets.
Waves: But research means incurring higher costs.
Liang Wenfeng: If we were only replicating, we could rely on public papers or open-source code, requiring minimal training or just fine-tuning, which is low-cost. Research, however, involves extensive experiments and comparisons, demanding more computing power and higher expertise, so the costs are significantly higher.
Waves: Where does the research funding come from?
Liang Wenfeng: High-Flyer, as one of our backers, has ample R&D budgets. Additionally, we have an annual donation budget of several hundred million Yuan, which has traditionally gone to public welfare organizations. If needed, we can adjust this allocation.
Waves: But building foundational large models requires at least a few hundred million dollars just to get a seat at the table. How do you sustain such continuous investment?
Liang Wenfeng: We’re in discussions with various investors. Many VCs are hesitant about funding pure research because they need exits and want quick commercialization. Given our research-first approach, securing VC funding is challenging. However, we already have computing power and an engineering team, which gives us half the leverage we need.
Waves: What kind of business models have you envisioned?
Liang Wenfeng: We’re considering openly sharing most of our training results, which could align with commercialization. We hope that more people, even small app developers, can access large models at low costs, rather than having the technology monopolized by a few individuals or companies.
Waves: Some major companies will also offer services later. What differentiates you?
Liang Wenfeng: Major companies’ models might be tied to their platforms or ecosystems, whereas we are completely free.
Waves: Regardless, it seems somewhat crazy for a commercial company to engage in such an open-ended, research-driven exploration.
Liang Wenfeng: If you’re looking for a purely commercial rationale, you might not find one because it’s not cost-effective. From a business perspective, basic research has a low return on investment. When OpenAI’s early investors put in money, they weren’t thinking about returns but about genuinely wanting to advance this field. What we’re certain of is that since we want to do this, have the capability, and the timing is right, we are one of the most suitable candidates.
04 The 10,000 GPU Reserve Was Driven by Curiosity
Waves: GPUs have become a scarce resource in the ChatGPT-driven entrepreneurial wave. Yet, as early as 2021, you had the foresight to stockpile 10,000 GPUs. Why?
Liang Wenfeng: This process happened gradually—from owning just one GPU in the early days, to 100 in 2015, 1,000 in 2019, and eventually 10,000. When we had only a few hundred GPUs, we hosted them in IDCs (Internet Data Centers). But as the scale grew, hosting became insufficient, so we started building our own server rooms. Many people might think there’s some hidden business logic behind this, but in reality, it was mainly driven by curiosity.
Waves: What kind of curiosity?
Liang Wenfeng: Curiosity about the boundaries of AI capabilities. For outsiders, the ChatGPT wave feels like a massive shock, but for insiders, the real paradigm shift began with AlexNet in 2012. AlexNet’s error rate was far lower than other models at the time, reviving neural network research after decades of dormancy. While specific technical directions have kept changing, the combination of models, data, and computing power has remained constant. Especially after OpenAI released GPT-3 in 2020, the direction became clear: massive computing power was needed. But even in 2021, when we invested in building Firefly II, most people still couldn’t understand why.
Waves: So you’ve been paying attention to computing power reserves since 2012?
Liang Wenfeng: For researchers, the thirst for computing power is endless. After conducting small-scale experiments, you always want to scale up. Since then, we’ve consciously deployed as much computing power as possible.
Waves: Many people assume that building this computing cluster was for quantitative private fund business, using machine learning for price prediction.
Liang Wenfeng: If we were solely focused on quantitative investing, a small number of GPUs would suffice. Beyond investing, we’ve conducted extensive research to understand what paradigms can fully describe financial markets, whether there are simpler expressions, where the boundaries of these paradigms lie, and whether they have broader applicability.
Waves: But this process is also a money-burning endeavor.
Liang Wenfeng: Something exciting can’t always be measured purely in monetary terms. It’s like buying a piano for your home—you do it because you can afford it, and because there are people eager to play it.
Waves: GPUs typically depreciate at a rate of 20%.
Liang Wenfeng: We haven’t calculated it precisely, but it’s probably not that high. NVIDIA GPUs are like hard currency—even older models are still widely used. When we decommissioned older GPUs, they still fetched a good price on the second-hand market, so we didn’t lose much.
Waves: Building a computing cluster involves significant maintenance costs, labor expenses, and even electricity bills.
Liang Wenfeng: Electricity and maintenance costs are actually quite low, accounting for only about 1% of the hardware cost annually. Labor costs are higher, but they’re also an investment in the future—our people are the company’s greatest asset. We select individuals who are relatively down-to-earth, curious, and eager to conduct research here.
Waves: In 2021, High-Flyer was among the first in the Asia-Pacific region to acquire A100 GPUs. Why were you earlier than some cloud providers?
Liang Wenfeng: We had already conducted pre-research, testing, and planning for the new GPUs. As for cloud providers, from what I know, their demand was scattered until 2022, when autonomous driving companies needed to rent machines for training and had the ability to pay. Only then did some cloud providers build out their infrastructure. Large companies rarely engage in pure research or training—they’re more driven by business needs.
Waves: How do you view the competitive landscape of large models?
Liang Wenfeng: Major companies certainly have advantages, but if they can’t quickly apply their models, they might not sustain their efforts because they need to see results. Leading startups also have solid technical foundations, but like the previous wave of AI startups, they face commercialization challenges.
Waves: Some people might think that a quantitative fund emphasizing its AI work is just creating hype for other businesses.
Liang Wenfeng: But in reality, our quantitative fund has largely stopped raising external capital.
Waves: How do you distinguish between true AI believers and opportunists?
Liang Wenfeng: True believers were here before and will remain here afterward. They’re more likely to buy GPUs in bulk or sign long-term agreements with cloud providers, rather than renting short-term.
---
05 The Development of the V2 Model Was Driven by Local Talent
Waves: Jack Clark, former policy lead at OpenAI and co-founder of Anthropic, described DeepSeek as having hired "a group of enigmatic geniuses." What kind of team created DeepSeek V2?
Liang Wenfeng: There are no enigmatic geniuses here. The team consists of fresh graduates from top universities, PhD students in their fourth or fifth year, interns, and young professionals who graduated just a few years ago.
Waves: Many large model companies are obsessed with recruiting talent from overseas. Some believe that the top 50 talents in this field are likely not working for Chinese companies. Where does your team come from?
Liang Wenfeng: The V2 model was developed entirely by local talent—no one returned from overseas. While the top 50 talents might not be in China, perhaps we can cultivate such talent ourselves.
Waves: How did the MLA innovation come about? I heard the idea originally came from a young researcher’s personal interest?
Liang Wenfeng: After summarizing some of the mainstream evolutionary patterns of the Attention architecture, he had a sudden inspiration to design an alternative. However, turning that idea into reality was a long process. We formed a dedicated team and spent several months getting it to work.
Waves: The birth of such divergent ideas is closely tied to your entirely innovation-driven organizational structure. Even during the High-Flyer era, you rarely assigned goals or tasks top-down. But for frontier explorations like AGI, which are full of uncertainty, do you introduce more management practices?
Liang Wenfeng: DeepSeek is also entirely bottom-up. We generally don’t pre-assign roles but let them emerge naturally. Everyone has their unique background and ideas, so there’s no need to push them. During exploration, when someone encounters a problem, they’ll naturally gather others to discuss it. However, when an idea shows potential, we do allocate resources top-down.
Waves: I’ve heard that DeepSeek is very flexible in allocating GPUs and personnel.
Liang Wenfeng: There’s no cap on how many GPUs or people someone can mobilize. If someone has an idea, they can immediately access the training cluster’s GPUs without approval. Since there are no hierarchies or cross-departmental barriers, they can also freely involve anyone, as long as the other person is interested.
Waves: This loose management style depends on having a group of highly passionate individuals. I’ve heard you’re skilled at identifying talent through unconventional criteria, allowing people who might not stand out in traditional evaluations to shine.
Liang Wenfeng: Our hiring criteria have always been passion and curiosity, so many people here have unique and interesting backgrounds. For many, the desire to conduct research far outweighs their concern for money.
Waves: The Transformer was born at Google’s AI Lab, and ChatGPT at OpenAI. What do you think is the difference in the value of innovation between a large company’s AI Lab and a startup?
Liang Wenfeng: Whether it’s Google’s lab, OpenAI, or even the AI Labs of major Chinese companies, they all hold significant value. The fact that OpenAI ultimately succeeded also has an element of historical serendipity.
---
06 Established Patterns Are Products of the Past and May Not Hold in the Future
Waves: Is innovation largely a matter of chance? I noticed that the row of meeting rooms in your office has doors on both sides that can be easily pushed open. Your colleagues mentioned that this design leaves room for serendipity. For example, the creation of the Transformer involved someone casually overhearing a discussion, joining in, and ultimately helping turn it into a universal framework.
Liang Wenfeng: I believe innovation is first and foremost a matter of belief. Why is Silicon Valley so innovative? Because they dare to try. When ChatGPT emerged, there was a lack of confidence in frontier innovation across China, from investors to major companies. Everyone felt the gap was too large and opted to focus on applications instead. But innovation requires confidence, and this confidence is often more evident in young people.
Waves: But you don’t participate in fundraising and rarely make public statements. In terms of social visibility, you’re certainly less prominent than companies that actively raise funds. How do you ensure DeepSeek is the top choice for those working on large models?
Liang Wenfeng: Because we’re tackling the hardest problems. What attracts top talent the most is the opportunity to solve the world’s most challenging problems. In fact, top talent in China is undervalued. There’s too little hardcore innovation at the societal level, so they rarely get recognized. By working on the hardest problems, we naturally attract them.
Waves: Recently, OpenAI’s release didn’t include GPT-5, leading many to believe the technology curve is slowing significantly. Some have even started questioning the Scaling Law. What’s your take?
Liang Wenfeng: We’re relatively optimistic. The industry seems to be progressing as expected. OpenAI isn’t infallible; they can’t always stay ahead.
Waves: How long do you think it will take to achieve AGI? Before releasing DeepSeek V2, you released models for code generation and mathematics, and you switched from dense models to MoE (Mixture of Experts). What milestones are on your AGI roadmap?
Liang Wenfeng: It could be 2 years, 5 years, or 10 years, but it will definitely happen within our lifetime. As for the roadmap, even within our company, there’s no consensus. But we’re betting on three directions: mathematics and code, multimodal capabilities, and natural language itself. Mathematics and code are natural testing grounds for AGI—like Go, they’re closed, verifiable systems where high intelligence might be achieved through self-learning. On the other hand, multimodal learning and interacting with the real world may also be necessary for AGI. We remain open to all possibilities.
Waves: What do you think the endgame for large models will look like?
Liang Wenfeng: There will be specialized companies providing foundational models and services, with a long chain of professional divisions. More people will build on these to meet society’s diverse needs.
Waves: Over the past year, there have been many changes in China’s large model startup scene. For example, Wang Huiwen, who was very active at the beginning of last year, stepped away midway, and later entrants have started to differentiate themselves.
Liang Wenfeng: Wang Huiwen bore all the losses himself, allowing others to exit unscathed. He made the choice that was worst for himself but best for everyone else. I admire his integrity.
Waves: Where are you focusing most of your energy now?
Liang Wenfeng: Most of my energy is spent researching the next generation of large models. There are still many unsolved problems.
Waves: Other large model startups insist on balancing both research and commercialization, as technology alone doesn’t guarantee lasting advantage. Capturing the time window to translate technological advantages into products is also crucial. Is DeepSeek’s focus on model research due to insufficient model capabilities?
Liang Wenfeng: All established patterns are products of the past and may not hold in the future. Applying internet business logic to future AI profit models is like discussing General Electric and Coca-Cola during Pony Ma’s early entrepreneurial days. It’s likely a case of “seeking a sword from a boat’s mark”—outdated thinking.
Waves: High-Flyer has always had a strong technological and innovative DNA, and its growth has been relatively smooth. Is this why you’re optimistic?
Liang Wenfeng: High-Flyer has, to some extent, strengthened our confidence in technology-driven innovation, but it hasn’t been all smooth sailing. We’ve gone through a long accumulation process. What the outside world sees is High-Flyer post-2015, but we’ve actually been at it for 16 years.
Waves: Returning to the topic of original innovation. With the economy entering a downturn and capital entering a cold cycle, will this further suppress original innovation?
Liang Wenfeng: I don’t think so. The adjustment of China’s industrial structure will rely more on hardcore technological innovation. When many realize that past quick profits were largely due to luck, they’ll be more willing to roll up their sleeves and engage in real innovation.
Waves: So you’re optimistic about this as well?
Liang Wenfeng: I grew up in the 1980s in a fifth-tier city in Guangdong. My father was an elementary school teacher. In the 1990s, there were many money-making opportunities in Guangdong, and many parents came to our house, essentially saying that education was useless. But looking back now, those views have changed. Because making money has become harder, even opportunities like driving a taxi have dried up. It only took one generation for this shift.
In the future, hardcore innovation will become more common. It might not be easily understood now because society as a whole needs to be educated by facts. When hardcore innovators achieve success and recognition, collective thinking will change. We just need more examples and time.
---
07 More Investment Doesn’t Necessarily Lead to More Innovation
Waves: DeepSeek currently has an idealistic vibe reminiscent of OpenAI’s early days, and it’s also open source. Will you choose to go closed-source later? Both OpenAI and Mistral have transitioned from open source to closed source.
Liang Wenfeng: We won’t go closed source. We believe building a strong technological ecosystem is more important.
Waves: Do you have plans to raise funding? Some media reports suggest that High-Flyer has plans to spin off DeepSeek for an independent IPO. AI startups in Silicon Valley also inevitably end up aligning with major tech companies.
Liang Wenfeng: We have no plans to raise funds in the short term. Our challenge has never been money but the embargo on high-end chips.
Waves: Many believe that working on AGI and quantitative finance are two entirely different things. Quantitative finance can be done quietly, but AGI might require a more high-profile approach, forming alliances to amplify your investments.
Liang Wenfeng: More investment doesn’t necessarily lead to more innovation. Otherwise, major companies could monopolize all innovation.
Waves: Are you avoiding applications because you lack operational expertise?
Liang Wenfeng: We believe the current phase is a period of technological innovation, not application development. In the long run, we hope to create an ecosystem where the industry directly uses our technology and outputs. We’ll focus on foundational models and cutting-edge innovation, while other companies build B2B and B2C businesses on top of DeepSeek. If a complete industrial chain can be formed, there’s no need for us to develop applications ourselves. Of course, if needed, we can develop applications, but research and technological innovation will always be our top priority.
Waves: But if choosing APIs, why choose DeepSeek over major companies?
Liang Wenfeng: The future will likely involve specialized divisions of labor. Foundational large models require continuous innovation, and major companies have their limitations—they might not be the best fit.
Waves: But can technology really create a gap? You’ve also said there are no absolute technological secrets.
Liang Wenfeng: Technology has no secrets, but replication takes time and cost. NVIDIA’s GPUs, for example, theoretically have no technological secrets and could be easily copied. But reorganizing a team and catching up to the next generation of technology takes time, so the actual moat is still wide.
Waves: After your price cuts, ByteDance was the first to follow suit, indicating they felt some level of threat. What’s your take on new approaches for startups competing with major companies?
Liang Wenfeng: To be honest, we don’t care much about this. We just happened to do it. Providing cloud services isn’t our main goal. Our goal is still to achieve AGI.
I don’t see any new approaches yet, but major companies don’t have a clear advantage either. They have existing users, but their cash flow businesses are also their burdens, making them vulnerable to disruption at any time.
Waves: What do you think about the endgame for the six large model startups besides DeepSeek?
Liang Wenfeng: Maybe 2 or 3 will survive. Right now, everyone is still in the money-burning phase, so those with clear self-positioning and more refined operations have a better chance of surviving. Other companies might undergo transformations. Valuable things won’t disappear—they’ll just take on a different form.
Waves: During the High-Flyer era, your approach to competition was described as “doing your own thing,” rarely caring about horizontal comparisons. What’s the principle of your thinking about competition?
Liang Wenfeng: What I often think about is whether something can improve societal efficiency, and whether you can find a position in its industrial chain that aligns with your strengths. As long as the end result improves societal efficiency, it’s valid. Many things in between are just phases, and over-focusing on them can be overwhelming.
---
08 Innovation Emerges Naturally—It’s Not Planned or Taught
Waves: How is the recruitment progress for the DeepSeek team?
Liang Wenfeng: The initial team is already in place. In the early stages, due to a lack of manpower, we temporarily borrowed some people from High-Flyer. We started recruiting when ChatGPT 3.5 went viral at the end of last year, but we still need more people to join.
Waves: Talent for large model startups is also scarce. Some investors say that many suitable candidates might only be found in AI labs of giants like OpenAI or Facebook AI Research. Will you recruit such talent from overseas?
Liang Wenfeng: If you’re pursuing short-term goals, hiring experienced people is the right move. But in the long run, experience is less important—foundational skills, creativity, and passion matter more. From this perspective, there are plenty of suitable candidates domestically.
Waves: Why is experience less important?
Liang Wenfeng: It’s not necessarily true that only someone who has done something before can do it. At High-Flyer, one of our hiring principles is to focus on ability, not experience. Most of our core technical roles are filled by fresh graduates or those who graduated one or two years ago.
Waves: In innovative ventures, do you think experience can be a hindrance?
Liang Wenfeng: When doing something, experienced people will instinctively tell you how it should be done, while those without experience will explore, think carefully, and find a solution that fits the current reality.
Waves: High-Flyer entered the quantitative finance industry as complete outsiders with no financial background and became a top player within a few years. Is this hiring philosophy one of the secrets to your success?
Liang Wenfeng: Our core team, including myself, initially had no quantitative experience, which is quite unique. I wouldn’t call it the secret to success, but it’s part of High-Flyer’s culture. We don’t intentionally avoid experienced people, but we prioritize ability.
Take sales as an example. Two of our main salespeople were complete newcomers to the industry. One previously worked in German machinery exports, and the other wrote backend code for a securities firm. When they entered this industry, they had no experience, no resources, and no connections.
Now, we might be the only major private fund that relies primarily on direct sales. Direct sales mean no fees for intermediaries, resulting in higher profit margins for the same scale and performance. Many have tried to imitate us but haven’t succeeded.
Waves: Why have many failed to imitate you?
Liang Wenfeng: Because that alone isn’t enough to drive innovation. It needs to align with the company’s culture and management. In fact, those two salespeople achieved nothing in their first year and only started seeing results in the second. But our evaluation criteria are different from most companies’. We don’t have KPIs or assigned tasks.
Waves: What are your evaluation criteria, then?
Liang Wenfeng: Unlike typical companies that focus on client order volume, our salespeople’s compensation isn’t tied to predefined targets. Instead, we encourage them to build their own networks, meet more people, and create greater influence. We believe that a trustworthy, honest salesperson might not immediately secure orders but will come across to clients as reliable.
Waves: Once you’ve found the right people, how do you get them up to speed?
Liang Wenfeng: Give them important tasks and don’t interfere. Let them figure things out and express themselves. A company’s DNA is hard to imitate. For example, hiring inexperienced people, assessing their potential, and fostering their growth—these things can’t be directly copied.
Waves: What do you think are the essential conditions for building an innovative organization?
Liang Wenfeng: Our conclusion is that innovation requires minimal intervention and management, giving everyone the freedom to explore and room to make mistakes. Innovation often emerges naturally—it’s not planned or taught.
Waves: This is a very unconventional management style. How do you ensure efficiency and alignment with your goals in such an environment?
Liang Wenfeng: Ensure shared values during recruitment, and then align everyone through company culture. Of course, we don’t have a written company culture because anything formalized can hinder innovation. More often, it’s about leading by example—how you make decisions becomes the standard.
Waves: Do you think that in this wave of large model competition, the innovative organizational structure of startups could be a breakthrough point against major companies?
Liang Wenfeng: If you follow textbook methodologies, what startups are doing today wouldn’t survive. But the market is changing. The real deciding factor isn’t predefined rules or conditions but the ability to adapt and adjust. Many large companies’ structures can’t respond or act quickly, and their past experiences and inertia can become constraints. Under this new AI wave, a new batch of companies will inevitably emerge.
Waves: What excites you most about doing this?
Liang Wenfeng: Figuring out whether our hypotheses are true. If they are, it’s incredibly exciting.
Waves: What are the non-negotiable criteria for hiring in this large model initiative?
Liang Wenfeng: Passion and solid foundational skills. Nothing else is as important.
Waves: Are such people easy to find?
Liang Wenfeng: Their enthusiasm usually shows because they genuinely want to do this. Often, these people are also looking for you.
Waves: Large models might require endless investment. Does the cost ever make you hesitate?
Liang Wenfeng: Innovation is expensive and inefficient, often accompanied by waste. That’s why innovation only emerges when the economy reaches a certain level. When resources are scarce or industries aren’t innovation-driven, cost and efficiency are critical. Look at OpenAI—it burned a lot of money to get where it is.
Waves: Do you feel like you’re doing something crazy?
Liang Wenfeng: I don’t know if it’s crazy, but there are many things in this world that can’t be explained by logic. Like many programmers who are avid contributors to open-source communities—they’re exhausted after a long day but still contribute code.
Waves: There’s a kind of intrinsic reward in that.
Liang Wenfeng: It’s like hiking 50 kilometers—your body is exhausted, but your spirit is fulfilled.
Waves: Do you think curiosity-driven madness can last forever?
Liang Wenfeng: Not everyone can stay crazy forever, but most people, in their younger years, can fully devote themselves to something without any utilitarian purpose.
---