Adobe is under scrutiny after facing a proposed class-action lawsuit, filed by author Elizabeth Lyon in Oregon, for allegedly using copyrighted books as training material for its AI program SlimLM without securing proper permissions. This lawsuit highlights a growing concern in the tech industry about generative AI systems and their reliance on datasets that might inadvertently include copyrighted works.
Let’s dive into the details and implications of this case and how it resonates with broader trends in AI and copyright disputes.
The Allegations Against Adobe
At the center of the lawsuit is Adobe’s alleged use of pirated content in the training dataset SlimPajama-627B, which, according to Lyon, was derived from another dataset, RedPajama, that includes the controversial Books3 collection. Books3, a collection of approximately 191,000 titles, has drawn attention for its role in training generative AI systems. Authors allege it includes their works without credit, payment, or consent.
Adobe has claimed that SlimLM relied on SlimPajama, which it referred to as an open-source dataset released by Cerebras in 2023. However, the case has raised questions about the larger dataset's lineage, sparking debate around AI ethics and copyright compliance.
Broader Trends: AI Copyright Lawsuits
Adobe isn't the first tech company to come under fire for alleged copyright violations in AI model training. Other major tech players have faced similar accusations.
Notable recent cases:
- Anthropic: Settled for $1.5 billion with authors claiming its chatbot Claude used unauthorized books for training.
- Apple: Accused of using Books3 for its AI models.
- Salesforce: Sued for incorporating RedPajama in its AI processes.
These lawsuits underscore the growing friction between copyright holders and companies building AI systems. As generative AI develops, creators from various industries are raising concerns about how their intellectual works are used, and monetized, without their involvement.
Impacts for Entrepreneurs
If you're a startup founder in Europe, particularly one working in AI or software development, cases like Adobe's serve as cautionary tales. The intersection of copyright law and AI requires careful navigation. For entrepreneurs bootstrapping their ideas with limited resources, overlooking ethical data usage can create financial liabilities that derail their projects.
Lessons to Learn
Here are some insights to keep in mind:
- Understand the sources of your datasets: If you’re using external data for training purposes, trace each component back to its origin and confirm that its license actually permits your use.
- Establish transparency: Make clear agreements about licensing and permissions, even when using supposedly open-source datasets.
- Implement a process for audits: Regularly review and validate the data you’re using in practice.
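One way to make the audit point concrete is to keep a small registry of every dataset you use and walk its upstream sources. The following is a minimal sketch, not a legal compliance tool: the `DatasetRecord` structure, the `flagged` set, and the license strings are all hypothetical, and the example lineage simply mirrors what the complaint alleges about SlimPajama, RedPajama, and Books3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class DatasetRecord:
    """Hypothetical record describing one dataset in an internal registry."""
    name: str
    license: Optional[str] = None  # None means no license is documented
    derived_from: List[str] = field(default_factory=list)


def trace_lineage(name: str, registry: Dict[str, DatasetRecord]) -> List[str]:
    """Return every upstream dataset reachable from `name`, depth-first."""
    seen: List[str] = []
    stack = list(registry[name].derived_from) if name in registry else []
    while stack:
        current = stack.pop()
        if current in seen or current == name:
            continue  # guard against duplicates and circular references
        seen.append(current)
        if current in registry:
            stack.extend(registry[current].derived_from)
    return seen


def audit(name: str, registry: Dict[str, DatasetRecord], flagged: Set[str]) -> List[str]:
    """List problems found in a dataset or anything it was derived from."""
    issues: List[str] = []
    for ds in [name] + trace_lineage(name, registry):
        record = registry.get(ds)
        if record is None or record.license is None:
            issues.append(f"{ds}: no documented license")
        if ds in flagged:
            issues.append(f"{ds}: on the flagged-source list")
    return issues


# Illustrative entries only. The lineage follows the allegations in the
# lawsuit, and the license values are placeholders, not legal findings.
registry = {
    "SlimPajama-627B": DatasetRecord(
        "SlimPajama-627B", "placeholder-open-license", ["RedPajama"]
    ),
    "RedPajama": DatasetRecord("RedPajama", "placeholder-open-license", ["Books3"]),
    "Books3": DatasetRecord("Books3"),
}

issues = audit("SlimPajama-627B", registry, flagged={"Books3"})
for issue in issues:
    print(issue)
```

The point of the design is that a dataset labeled open-source can still fail the audit because of what sits upstream of it, which is exactly the situation the lawsuit describes.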
Common Mistakes Entrepreneurs Should Avoid
- Relying on loosely defined 'open-source' datasets: Some datasets marketed as open-source include copyrighted materials. Always verify.
- Ignoring legal consultations: Skipping expert advice on copyright laws can lead to costly mistakes.
- Positioning AI without ethical communication: Clearly explain how your AI product sources and transforms data; transparent messaging builds trust with users and minimizes backlash.
Practical Guidance for Entrepreneurs
If you’re leading a business that involves AI or data manipulation, here’s a quick how-to guide for keeping your operations compliant:
Step 1: Vet your data sources rigorously
Explore trusted platforms and providers for curated datasets. Avoid scraping or downloading datasets without clear permissions, even from publicly available sources.
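A vetting step like this can be turned into a simple gate that every candidate source must pass before it enters a training pipeline. The sketch below is only an illustration under stated assumptions: the `SourceCandidate` fields and the `APPROVED_LICENSES` allowlist are hypothetical, and deciding which licenses belong on that list is a question for legal counsel, not for code.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allowlist. The first two are SPDX license identifiers; the
# third is a made-up label for data covered by a signed agreement.
APPROVED_LICENSES = {"CC0-1.0", "CC-BY-4.0", "licensed-by-agreement"}


@dataclass
class SourceCandidate:
    """Hypothetical description of a data source under consideration."""
    name: str
    license: Optional[str]      # None if no license could be found
    permission_on_file: bool    # True if a written license or agreement is stored
    obtained_by_scraping: bool  # True if the data was scraped or bulk-downloaded


def vet(candidate: SourceCandidate) -> List[str]:
    """Return reasons to reject; an empty list means the basic gate is passed."""
    reasons: List[str] = []
    if candidate.license not in APPROVED_LICENSES:
        reasons.append("license missing or not on the approved list")
    if candidate.obtained_by_scraping and not candidate.permission_on_file:
        reasons.append("scraped without documented permission")
    return reasons


# Two illustrative candidates: one scraped with no paperwork, one licensed.
scraped = SourceCandidate("public-forum-dump", None, False, True)
licensed = SourceCandidate("publisher-catalog", "licensed-by-agreement", True, False)

print(vet(scraped))
print(vet(licensed))
```

Passing this gate does not prove a source is safe to use; it only confirms that the minimum paperwork exists and gives a reviewer a record to check.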
Step 2: Build partnerships with libraries or content owners
Many creators are more open to cooperation than litigation. Reach out to authors, publishers, or content owners directly for licensing agreements.
Step 3: Prioritize ethical branding
Being proactive about ethical principles in your product's development can set you apart. This not only mitigates risks but also creates goodwill among potential stakeholders.
Female Entrepreneurs’ Viewpoint
In my years of building startups, especially as someone bootstrapping businesses across Europe, I’ve advocated one thing all along, and this case reinforces it: set ethical boundaries when innovation blurs into gray areas. Entrepreneurs, especially in AI and tech-skewed fields, often hear about pushing limits to achieve results. But as these lawsuits show, there’s a fine line between leveraging creativity and overstepping your reach into others’ work.
Moreover, women in tech already face nuanced struggles establishing themselves in such industries. Adding unnecessary legal trouble by ignoring copyright compliance only compounds them. Ethical decision-making combined with smart business tactics ensures fewer setbacks and a stronger foundation for growth.
Final Thoughts
The proposed lawsuit against Adobe serves as a reminder to every business innovating with AI or datasets: the future of your company lies not just in the technology you build but in how responsibly you build it. Whether you’re creating prototypes or scaling up an operation, accountability should become your practiced discipline.
Start by turning this case into a teachable moment. Whether you're a startup founder in Amsterdam or a freelancer in Cologne, taking ownership of ethical principles goes beyond legal protection; it’s the foundation of sustainable growth.
Let’s face it: Change doesn’t start after the lawsuit. It starts the day you write your mission down on paper.
FAQ
1. What is the lawsuit against Adobe about?
Adobe is facing a proposed class-action lawsuit filed by author Elizabeth Lyon, alleging it used copyrighted books without permission to train its AI program, SlimLM. The issue centers around the inclusion of datasets like SlimPajama and Books3 that contain copyrighted works.
2. Why is the Books3 dataset controversial?
Books3 comprises approximately 191,000 books and has been widely used to train AI systems. However, much of its content is alleged to be pirated, raising copyright infringement concerns.
3. What other companies have been involved in similar lawsuits?
Tech firms like Anthropic, Apple, and Salesforce have faced lawsuits accusing them of using copyrighted content for AI training. Anthropic even settled for $1.5 billion.
4. What are the allegations regarding the SlimPajama dataset?
The SlimPajama dataset, reportedly used to train Adobe's SlimLM, is alleged to be derived from RedPajama, which includes content from the controversial Books3 dataset.
5. Has Adobe responded to these claims?
Adobe has said that SlimLM was trained using SlimPajama-627B, which it described as an open-source dataset released by Cerebras in 2023.
6. What is the significance of this lawsuit for the tech industry?
The case highlights concerns about how generative AI systems are trained using potentially copyrighted datasets. It raises broader ethical and legal questions surrounding AI development.
7. How can startups avoid similar issues?
Entrepreneurs can mitigate risks by carefully vetting their datasets, seeking proper permissions, and maintaining transparency in data sourcing.
8. What role does transparency play in AI development?
Transparency in data sourcing and usage builds trust with consumers and stakeholders while minimizing legal risks. Clear communication about data practices is essential for ethical AI branding.
9. How has RedPajama been implicated in copyright issues?
RedPajama, an open-source dataset used by various tech companies, reportedly includes content from Books3, which is alleged to contain pirated books. Its use has led to multiple lawsuits, including one against Salesforce.
10. What does the lawsuit mean for AI development trends?
Emerging legal cases are pushing the tech industry to address ethical and legal aspects of AI training datasets. The Adobe lawsuit could set a precedent for future copyright disputes in AI.
About the Author
Violetta Bonenkamp, also known as MeanCEO, is an experienced startup founder with an impressive educational background including an MBA and four other higher education degrees. She has over 20 years of work experience across multiple countries, including 5 years as a solopreneur and serial entrepreneur. Throughout her startup experience she has applied for multiple startup grants at the EU level, in the Netherlands and Malta, and her startups received quite a few of those. She’s been living, studying and working in many countries around the globe and her extensive multicultural experience has influenced her immensely.
Violetta Bonenkamp's expertise in the CAD sector, IP protection and blockchain
Violetta Bonenkamp is recognized as a multidisciplinary expert with significant achievements in the CAD sector, intellectual property (IP) protection, and blockchain technology.
CAD Sector:
- Violetta is the CEO and co-founder of CADChain, a deep tech startup focused on developing IP management software specifically for CAD (Computer-Aided Design) data. CADChain addresses the lack of industry standards for CAD data protection and sharing, using innovative technology to secure and manage design data.
- She has led the company since its inception in 2018, overseeing R&D, PR, and business development, and driving the creation of products for platforms such as Autodesk Inventor, Blender, and SolidWorks.
- Her leadership has been instrumental in scaling CADChain from a small team to a significant player in the deeptech space, with a diverse, international team.
IP Protection:
- Violetta has built deep expertise in intellectual property, combining academic training with practical startup experience. She has taken specialized courses in IP from institutions like WIPO and the EU IPO.
- She is known for sharing actionable strategies for startup IP protection, leveraging both legal and technological approaches, and has published guides and content on this topic for the entrepreneurial community.
- Her work at CADChain directly addresses the need for robust IP protection in the engineering and design industries, integrating cybersecurity and compliance measures to safeguard digital assets.
Blockchain:
- Violetta’s entry into the blockchain sector began with the founding of CADChain, which uses blockchain as a core technology for securing and managing CAD data.
- She holds several certifications in blockchain and has participated in major hackathons and policy forums, such as the OECD Global Blockchain Policy Forum.
- Her expertise extends to applying blockchain for IP management, ensuring data integrity, traceability, and secure sharing in the CAD industry.
Violetta is a true multiple specialist who has built expertise in Linguistics, Education, Business Management, Blockchain, Entrepreneurship, Intellectual Property, Game Design, AI, SEO, Digital Marketing, cyber security and zero code automations. Her extensive educational journey includes a Master of Arts in Linguistics and Education, an Advanced Master in Linguistics from Belgium (2006-2007), an MBA from Blekinge Institute of Technology in Sweden (2006-2008), and an Erasmus Mundus joint program European Master of Higher Education from universities in Norway, Finland, and Portugal (2009).
She is the founder of Fe/male Switch, a startup game that encourages women to enter STEM fields, and also leads CADChain, and multiple other projects like the Directory of 1,000 Startup Cities with a proprietary MeanCEO Index that ranks cities for female entrepreneurs. Violetta created the "gamepreneurship" methodology, which forms the scientific basis of her startup game. She also builds a lot of SEO tools for startups. Her achievements include being named one of the top 100 women in Europe by EU Startups in 2022 and being nominated for Impact Person of the year at the Dutch Blockchain Week. She is an author with Sifted and a speaker at different Universities. Recently she published a book on Startup Idea Validation the right way: from zero to first customers and beyond, launched a Directory of 1,500+ websites for startups to list themselves in order to gain traction and build backlinks and is building MELA AI to help local restaurants in Malta get more visibility online.
For the past several years Violetta has been living between the Netherlands and Malta, while also regularly traveling to different destinations around the globe, usually due to her entrepreneurial activities. This has led her to start writing about different locations and amenities from the POV of an entrepreneur.
About the Publication
Fe/male Switch is an innovative startup platform designed to empower women entrepreneurs through an immersive, game-like experience. Founded in 2020 during the pandemic "without any funding and without any code," this non-profit initiative has evolved into a comprehensive educational tool for aspiring female entrepreneurs. The platform was co-founded by Violetta Shishkina-Bonenkamp, who serves as CEO and one of the lead authors of the Startup News branch.
Mission and Purpose
Fe/male Switch Foundation was created to address the gender gap in the tech and entrepreneurship space. The platform aims to skill-up future female tech leaders and empower them to create resilient and innovative tech startups through what they call "gamepreneurship". By putting players in a virtual startup village where they must survive and thrive, the startup game allows women to test their entrepreneurial abilities without financial risk.
Key Features
The platform offers a unique blend of news, resources, learning, networking, and practical application within a supportive, female-focused environment:
- Skill Lab: Micro-modules covering essential startup skills
- Virtual Startup Building: Create or join startups and tackle real-world challenges
- AI Co-founder (PlayPal): Guides users through the startup process
- SANDBOX: A testing environment for idea validation before launch
- Wellness Integration: Virtual activities to balance work and self-care
- Marketplace: Buy or sell expert sessions and tutorials
Impact and Growth
Since its inception, Fe/male Switch has shown impressive growth:
- 5,000+ female entrepreneurs in the community
- 100+ startup tools built
- 5,000+ articles and news pieces written
- 1,000 unique business ideas for women created
Partnerships
Fe/male Switch has formed strategic partnerships to enhance its offerings. In January 2022, it teamed up with global website builder Tilda to provide free access to website building tools and mentorship services for Fe/male Switch participants.
Recognition
Fe/male Switch has received media attention for its innovative approach to closing the gender gap in tech entrepreneurship. The platform has been featured in various publications highlighting its unique "play to learn and earn" model.


