Deep Dive: Alibaba, the company behind Qwen
Is Alibaba's Qwen the Open-Source AI Winner?
Last week, I published a deep dive into Alibaba’s AI strategy on Michael Spencer’s AI Supremacy, and it really helped me find some confidence and courage to keep writing. Starting a newsletter from scratch without a single imported contact has sometimes been lonely and discouraging. But I’m so grateful for Michael and the community because it’s only been 2 to 3 months of active writing/ self-publishing, and I’ve officially passed the 100 subscribers mark! So I’m telling myself, let’s take it day by day, topic by topic, and keep learning and writing. Baby steps, one 100-subscribers at a time.
Thank you for the kind words and encouragement, Michael. Wooo!
For those of you who have never met me in person, I hope to connect with you online or offline! Here’s a short intro video I made for the guest post.
“Qwen2.5 is so far ahead of other LLMs that it has become normal not to include it in evaluations.” - Benjamin Marie (October 28th, 2024).
Michael Spencer wrote the below as an intro to my piece:
On November 11th, 2024, Alibaba Cloud’s Qwen team released Qwen2.5-Coder-32B-Instruct. The performance of this model surprised many people but not those of us who were already bullish on Qwen. The flagship model, Qwen2.5-Coder-32B-Instruct, reaches top-tier performance, highly competitive (or even surpassing) proprietary models like GPT-4o, in a series of benchmark evaluations, including HumanEval, MBPP, LiveCodeBench, BigCodeBench, McEval, Aider, etc. It reaches 92.7 in HumanEval, 90.2 in MBPP, 31.4 in LiveCodeBench, 73.7 in Aider, 85.1 in Spider, and 68.9 in CodeArena.
Open-source (technically Open-weight) large language models are an exciting domain because they can lead to many real-world applications and a democratization of Generative AI globally. The idea and continuous debate of how many months behind open-source models are to proprietary frontier and closed-source models fascinates me, as does the axis between centralization and decentralization. China BigTech needs to follow the enormous capex investments of the West if they can. Thus, Tencent and Alibaba are upping their game with Baidu now, and Alibaba Cloud’s Qwen2.5-Coder-32B-Instruct feels almost like a new paradigm for the ceiling of Open-Source LLMs heading into 2025.
Now…
Alibaba, the company behind Qwen
A deep dive into Alibaba’s AI strategy, the company behind the Qwen series
Despite chip bans from the U.S., China’s AI ecosystem has outperformed many expectations, especially garnering international attention from developers as Alibaba’s open-source Qwen series has been widely adopted and been widely talked about in the community.
China has created a completely separate AI ecosystem for various reasons:
1) to lessen dependency on the West and 2) the Great Firewall censorship constraints, but that is not to say that innovation is stifled. In fact, there is an extremely vibrant set of players in China right now across the AI ecosystem, and today, we will dive deep into the role Alibaba plays in the space.
Leadership Reshuffle: Alibaba Doubles Down on AI as Joe Tsai Returns
Most people know that Alibaba is an e-commerce company founded by the flamboyant billionaire Jack Ma, but Alibaba actually has a wide range of businesses. The e-commerce juggernaut was once the unquestionable market leader, but it has faced fierce competition from newcomers such as Pinduoduo and ByteDance’s Douyin.
This time around, it is making sure that it will not be complacent and is going all in on AI with cloud computing, its own proprietary LLM model, innovation in AI application, and investment into the whole ecosystem, as “AI is too important of a path to just go one direction,” said Joe Tsai, Chairman of Alibaba Group at a recent JP Morgan event.
For a while, doubts about the company’s future arose after the founder’s notorious Shanghai Bund talk, which led to a series of high-profile government probes into the business. Two years of volatility ended with a series of internal personnel changes amid a slowing domestic consumption environment. The company’s share price tumbled to USD 57 in October 2022, the lowest it’s ever been.
For context, the previous CEO, Daniel Zhang, rose from Alibaba’s corporate ladder ranks and mostly prioritized the e-commerce business. He was notably known for creating Alibaba’s Singles Day and as the architect of its international e-commerce strategies and innovations. He was always very pragmatic and focused on what he often referred to as the "flywheel models” to describe Alibaba's growth strategy and operational dynamics. What he meant in layman's terms was a self-reinforcing cycle where the interconnectedness of its e-commerce platforms, data-driven insights, and continuous reinvestment in technology and services would create synergy and thus sustainable growth and competitive advantage against others in the race. Although the cloud was a key business for him, it was mostly seen as a computing service provider, and it wasn't until more recently that Alibaba more publicly came out to advocate for AI integration and put the cloud front and center of its business strategy.
After nearly ten years at the realm, in late 2023, Daniel Zhang stepped down from his CEO position and passed over the baton to Eddie Wu. And Joe Tsai reemerged in the public. This was a huge turning point for the company’s AI strategy.
Joe Tsai’s sudden return to the front and center of the company’s daily operations in 2023 came as a surprise to many. When he returned as the Chairperson of Alibaba Group, he seemed to have brought back his business intuition as well as his observations from California – where he was residing for most of COVID – back to the Hangzhou-based company.
Joe has always been a compelling spokesperson, a charismatic speaker and storyteller, but when he was partnered with Jack, he was seen as the “posh and relatively more reserved one” where Jack was the company’s visionary, spiritual leader and loudest advocate. However, now things have changed, in contrast to his new partner in crime Eddie Wu, who is reserved and media-shy, Joe is happily telling the company’s story, going on podcasts and panels across the region to boost investor confidence, especially highlighting the company’s ambitious AI strategy.
Joe has so much confidence and conviction in the company’s new found focus that he has apparently now even started a mantra internally, hyping employees up with a chant: “Baba will reach 200 again,” which implies that $BABA’s stock price will shoot up to $200 again, (in the last two years it dropped from its peak of over USD300 to ~USD100).
Alibaba’s AI Playbook
Alibaba is invested in AI in five major ways with a twin-strategy
End-to-End tech stack strategy
Building proprietary LLM - Qwen and offering its LLMs to AI builders
Cloud computing service
Designing chips catered for processing AI applications.
Ecosystem Strategy
Implementing AI into its existing consumer facing applications
Funding AI companies across the ecosystem
Alibaba is easily the most well-known Chinese tech company internationally with a leading cloud business and its own proprietary LLM technology. Although in China, Baidu and Huawei each have their own models and cloud service, Baidu’s data focused strategy has always been more focused on its autonomous driving technology and Huawei has always been more focused on compute and hardware, whereas its LLM is seen to be more a “nice to have” add on for enterprise clients.
In contrast, Alibaba has repeatedly said that it aims to “make AI accessible to all.” At the 2024 Apsara Conference, Alibaba CEO Eddie Wu emphasized that the company is committed to supporting the open-source ecosystem from chips, servers, networks, to storage and data centers.
Proprietary LLM: Tongyi Qianwen (Qwen)
At the forefront of Alibaba's AI offerings is Tongyi Qianwen, a large language model akin to a "super chatbot." This advanced model is capable of understanding and generating text, making it suitable for a wide range of applications, including article generation, conversational responses, and customer support.
The Qwen series - have incredible scale, performance across benchmarks, multimodal features, and commitment to accessibility for a wide range of users. And Alibaba has made this technology publicly available, allowing other businesses to utilize it for free to enhance their customer service capabilities.
“It is the most-competitive Chinese LLM when compared to the likes of GPT4/4.o in terms of its overall performance,” said Leo Jiang, founder of GroundAI and former Huawei Chief Digital Officer. He added, what makes Qwen special is because of its two formats, “its API driven LLM service offers quicker time to market, and cost effectiveness. Whereas its open-source version gives more control and privacy to its clients.”
Alibaba launched its large language development tool Tongyi Qianwen in 2023 and it is often referred to as Qwen and it is now at its 2.5 iteration. The Qwen models, including the Qwen-72B and Qwen-1.8B, are notable for their diverse parameter sizes—ranging from 1.8 billion to 72 billion parameters—and their multimodal capabilities, which allow them to process not just text but also audio and visual data. This flexibility is enhanced by their training on over 3 trillion tokens, enabling them to outperform many other open-source models across various benchmarks, including multitask accuracy and code generation capabilities.
Qwen has positioned itself as an all-around AI assistant, with five key application use-cases:
real-time meeting transcription and summaries
processing lengthy content and providing summaries that require complicated comprehension
AI PowerPoint presentation creation
real-time simultaneous translation
video chat with an AI agent that can provide problem solving.
Source: Alibaba
The uniqueness of Qwen lies in its impressive technology and strong commitment to open-source principles, as Alibaba makes various versions of its models available on platforms like Hugging Face and ModelScope. This approach fosters a collaborative environment where developers can experiment and innovate, democratizing access to advanced AI technologies for businesses of all sizes.
In particular, companies with fewer than 100 million monthly active users can use these models for free, promoting wider adoption across industries. And by supporting the growth of the open-source community, Alibaba has aimed to empower users to effectively harness AI capabilities while reducing reliance on proprietary technologies.
translated the well-circulated AItechtalk article on why Qwen is the world’s most popular open-source large model right now, which wrote that, “per Hugging Face data, the Qwen series/bloodline of models has reached more than 50,000. That is, developers around the world have trained more than 50,000 derivative models based on the Qwen series base, second only to the Llama series of about 70,000. This data is the most convincing indicator for judging the ecosystem-level influence of a model.”Impressively, the Qwen models have garnered significant interest from across sectors, including automotive, gaming, and scientific research last year. The models have been downloaded over 40 million times since their introduction. Additionally, the lightweight Qwen-1.8B model is designed for deployment on edge devices such as smartphones, making it an attractive option for applications requiring lower computational resources.
The most recent comprehensive upgrade of Qwen2.5 means larger parameter scale, more powerful comprehension of photos and videos, a large-scale audio language model and continued open source models. Not only has it been improved drastically, the cost of strong inference capabilities to support complex tasks have been reduced for both Qwen-Plus and Qwen-Turbo.
Looking ahead, CEO Eddie Wu noted that while AI development has progressed rapidly, AGI (Artificial General Intelligence) is still in its early stages. He emphasized the importance of collaboration and highlighted that the API inference cost for Tongyi Qianwen has dropped by 97% year-on-year, a key factor contributing to its growing popularity. In fact, this is verified by Leo, the former Huawei executive who noted that the Qwen models offer higher accuracy and factuality compared to most other models based in China. It can be customized for enterprise use cases that prioritize the accuracy of outputs and aim to minimize model hallucinations and in addition, Qwen’s biggest edge right now is that it is providing developers a powerful yet cost-effective alternative.
Alibaba Cloud
AI and the cloud business is like the left hand and the right hand, said Joe Tsai in a podcast speaking with Norwegian hedge fund manager Nicolai Tangen. As mentioned earlier, anyone can use Alibaba’s LLM through APIs, or directly go to its open-source model. However, for any of them who want to deploy Qwen they would need cloud computing power and Alibaba Cloud is there to provide that.
In fact, currently 80% of China’s technology companies and half of the country’s large model companies run on Alibaba Cloud. This scale is simply unmatchable. Joe reiterated that with its cloud service as the largest provider in APAC, Alibaba has a huge advantage in garnering data and trial for its Tongyi Qianwen. The positive cycle allows the two businesses across the AI layers to continuously feed into each other.
In addition, the company has created the largest open-source community called ModelScope which hosts many other open-source models on the marketplace and when developers use those open-source models, they will also need compute power, which has become a main driver for Alibaba’s cloud revenue.
By providing the cloud infrastructure to the startups, the tech giant is hoping to hedge its bets by allowing them to access the best consumer-facing application first hand. Providing the cloud infrastructure would enable the company to access a diverse pool of data across domains and use cases which it could potentially leverage to finetune its own models if given the permission. It would also mean talent acquisition and exposure to new innovation in the field will be more accessible.
Alibaba’s AI Applications
So let’s take a look at the application front. Alibaba has integrated AI into its own operations extensively, utilizing it for product recommendations on its e-commerce platform, intelligent customer service, AI empowered advertisement targeting, and AI-driven solutions in cloud services. Additionally, it is looking for ways to better use AI to enhance logistics efficiency and other use cases as well. Today, let’s just take a look at a few mature ones first.
The Artificial Intelligence Online Serving (AI OS) is a platform developed by the company's search engineering team. AI OS integrates personalized search, recommendation, and advertising, supporting various business scenarios across Alibaba's platforms, mostly focusing on the marketplace apps such as Taobao. The technology originally focused on Taobao's search capabilities has expanded to include deep learning technologies and various engines for search and recommendation.
Dingtalk is an enterprise chat software, similar to Slack. Across Dingtalk, all products have been AI-enabled with an embedded AI agent for enterprise and personal use which was launched at the beginning of 2024. The AI agent is a virtual robot that can examine data analytics and is equipped with memory, planning and execution capabilities.
The format to interact with the agent is through a chatbot similar to ChatGPT. The company’s suggested use cases include using the robot as a sales person, IT, HR administrative, financial or procurement staff and it can help companies automate many of the repetitive tedious tasks within the management process.
Meanwhile, Alimama is a platform that helps brands with ad-optimization on Alibaba’s ecommerce marketplace apps - Tmall/ Taobao. Alimama is a relatively unknown business unit of Alibaba but it was actually founded very early on in 2007. It is a digital marketing platform for businesses that are selling on the Taobao or Tmall platforms. The AI-empowered multi-media LMA was launched in April this year and has been fully applied to 2B applications now. The tools include AI sales agents that can handle client enquiries and basic ad design functions that can help improve efficiency and quality, sales analytics for budgeting and pricing purposes and inventory management which have all contributed to an improvement in ROI and text to picture or video generation ad services is also provided by Alimama with relatively low costs. And the company claims to have served over 1 million merchants on the platform and significantly reduced advertising production costs.
Investing to Capture All Possibilities (Opportunities)
Alibaba has actively acquired and invested in several promising AI companies across the layers, particularly those specializing in AI chip development and LLM developers. These strategic moves are aimed at expanding Alibaba's opportunities in the rapidly evolving AI landscape.
And in 2024 alone, Alibaba has led major funding rounds for multiple AI firms, including a $1 billion investment in Moonshot AI, which has seen its valuation soar to approximately $2.5 billion; $691 million funding round for Baichuan, raising its valuation to around $2.8 billion; and a commitment of > $600 million to MiniMax, which is three out of four of the so called, “tigers.”
Currently, the four most valuable AI startups in China have been nicknamed “The Four AI (small) Tigers”, while all of them have been founded within the last three to five years and already achieving monumental success with Moonshot to be valued at $3 billion, Minimax valued at $2 billion, Zhipu AI raising nearly $800 million and Baichuan said to be valued near $2 billion.
Alibaba’s Chips: T-Head
Last and often overlooked is Alibaba’s efforts in hardware development. News flash, Huawei is not the only Chinese big tech developing chip hardware.
Alibaba's chip venture, T-Head, is making significant strides in the development of RISC-V architecture as part of China's broader push for semiconductor self-sufficiency amid ongoing U.S. trade restrictions. T-Head has focused on creating high-performance chips that can support various applications, including artificial intelligence (AI), big data analysis, and online transactions.
One of T-Head's notable products is the Zhenyue 510, a controller chip designed for enterprise solid-state drives (SSDs). Launched at Alibaba's Apsara cloud computing conference, this chip promises to enhance performance in Alibaba Cloud's data centers by providing a 30% reduction in latency for input and output operations compared to existing solutions. This innovation is critical as it allows Alibaba to optimize its cloud services and improve efficiency in handling large-scale data processing tasks.
As China continues to navigate restrictions on U.S. technology, T-Head's focus on RISC-V represents a strategic move towards potential greater independence in chip design and manufacturing.
What we know is that Alibaba has taken a holistic approach in its AI strategy. It encompasses a comprehensive technology stack and has positioned itself as a key player in the ecosystem, which are all key foundations to further propel the Qwen models significantly. Built on a foundation of infrastructure-level scalability, down to the chip level, Qwen models are designed to support diverse applications across Alibaba's extensive e-commerce, app, and investment ecosystem. This strategic focus not only enhances the models' capabilities but also ensures their relevance and effectiveness in various enterprise-driven use cases that prioritize accuracy and minimize model hallucinations. It has successfully positioned itself as one of the most important players, if not, THE MOST IMPORTANT, in China’s AI ecosystem.
Original AI compositions. A delightful share.