Sources & Methods
Methods — how & why
Transparency is the point. Part of learning to work with AI is learning to make it accountable — knowing where a claim comes from, why a source was chosen, and what the model was told to do with it. I want this site to model that, not just talk about it.
Every external source that informs content here gets logged in extra-sources.json with an annotation explaining what it contributed. The CLAUDE.md file that governs how the AI assistant working on this site behaves has explicit sourcing rules — report before recording, Ruben decides what gets added. The methodology is in the tooling, not just stated.
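As a rough illustration of what "logged with an annotation" can mean in practice, here is a minimal sketch of a check that flags logged sources lacking an annotation. The field names (`title`, `url`, `annotation`) and the helper names are assumptions made for this example; the real extra-sources.json may be structured differently.

```python
import json

# Hypothetical schema: the real extra-sources.json may use different field names.
REQUIRED_FIELDS = ("title", "url", "annotation")


def find_problems(entry: dict) -> list[str]:
    """Return human-readable problems for one logged source."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        # A field counts as missing if it is absent, not a string, or blank.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems


def check_log(raw_json: str) -> dict[str, list[str]]:
    """Map each problematic entry (by title, else by index) to its problems."""
    entries = json.loads(raw_json)
    report = {}
    for index, entry in enumerate(entries):
        problems = find_problems(entry)
        if problems:
            label = entry.get("title") or f"entry {index}"
            report[label] = problems
    return report


# Illustrative data only; these are not real entries from the site.
SAMPLE = json.dumps([
    {"title": "Example source", "url": "https://example.com/a",
     "annotation": "Supplied the framing for one lesson."},
    {"title": "Unannotated source", "url": "https://example.com/b",
     "annotation": ""},
])

if __name__ == "__main__":
    print(check_log(SAMPLE))
```

A check like this could run before a commit, so that a source without an explanation of what it contributed is caught by the tooling rather than by memory.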
All practitioners, articles, case law, and tools referenced across the courses are listed below. Short quotes are used under fair use; follow each link for the full work.
Practitioners
Agent Mindset
Real-world practitioners whose workflows and writing are cited in the Agent Mindset course lessons.
Simon Willison
Creator of Datasette; long-time open-source developer who writes extensively about coding with LLMs
“A computer can never be held accountable. That's your job as the human in the loop.”
Andrej Karpathy
Former Director of AI at Tesla; OpenAI founding member; creator of the Neural Networks: Zero to Hero lecture series
“The model has read a large chunk of the internet. It doesn't remember facts the way you do — it has learned statistical patterns across trillions of words. The training data is everything: what it knows, what it doesn't, and where it will confidently confabulate.”
Long Ouyang et al. (OpenAI)
Research team at OpenAI; lead authors of InstructGPT, the paper that established RLHF as the standard approach to aligning large language models
“Our results show that fine-tuning with human feedback significantly improves outputs on a wide range of tasks — and that labelers strongly prefer InstructGPT outputs over those of GPT-3, despite InstructGPT having 100x fewer parameters.”
DeepSeek AI Research Team
Chinese AI research lab; authors of DeepSeek-R1, an open-source reasoning model that matched frontier closed-model performance using RL training on verifiable tasks
“We find that through purely reinforcement learning training, without any supervised chain-of-thought demonstrations, the model spontaneously develops sophisticated reasoning behaviours — including self-verification, backtracking, and extended deliberation before answering.”
Harvey
PowerUser
Platform documentation, third-party reviews, and press coverage referenced in the PowerUser course.
Harvey Platform Overview
Harvey AI
Official overview of Harvey's legal AI capabilities — document drafting, contract analysis, legal research, and matter summarisation.
Harvey AI
Harvey AI
Harvey's product home — enterprise AI built specifically for law, tax, and professional services firms.
Harvey: Most Innovative Companies 2026
Fast Company
Fast Company's profile of Harvey as one of the most innovative companies of 2026, covering its enterprise growth and legal AI positioning.
An Overview of Harvey AI's Features for Lawyers
Minnesota State Bar Association
Bar association overview of Harvey's feature set written for practising lawyers — useful for grounding capability expectations against professional standards.
Harvey AI Review
Tools for Humans
Independent practitioner review covering Harvey's strengths, limitations, and workflow integration from a user perspective.
Legora
PowerUser
Official resources and independent coverage of Legora referenced in the PowerUser course.
Legora for Law Firms
Legora
Legora's solution page for law firms — document intelligence, workflow automation, and due diligence tooling.
Legora
Legora
Legora's product home — legal AI platform with a strong focus on Nordic and European jurisdictions and data residency.
Choosing the Right Legal AI Solution: A Practical Guide
Legora
A practical framework for evaluating legal AI tools, covering jurisdiction fit, data governance, accuracy benchmarks, and hallucination risk.
Legal ethics & case law
PowerUser
Professional conduct rules and landmark cases that define how lawyers must use AI responsibly.
ABA Model Rules of Professional Conduct — Rules 1.1, 1.6, 5.1 & 5.3
American Bar Association
The foundational professional conduct rules underpinning responsible legal AI use: Rule 1.1 Comment 8 (technology competence), Rule 1.6 (confidentiality of client data), and Rules 5.1 & 5.3 (supervision of lawyers and non-lawyer assistants, including AI tools).
Mata v. Avianca, Inc., No. 22-cv-1461 (PKC) (S.D.N.Y. June 22, 2023)
United States District Court, S.D.N.Y.
Landmark 2023 case in which counsel submitted a brief citing non-existent cases generated by ChatGPT. The court sanctioned the attorneys and their firm. As a district court decision it is not binding precedent, but it is the most widely cited illustration that lawyers bear full accountability for AI-generated content they file.
Academic & technical foundations
PowerUser
Key research underpinning how AI language models and retrieval systems work.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis et al. (2020)
The foundational paper introducing RAG — the technique that lets AI models ground responses in retrieved documents rather than training memory alone. Core to how legal AI tools reduce hallucination when working with specific source material.
Empirical research on AI productivity
PowerUser
Academic empirical studies, systematic reviews, benchmarks, and legal-engineering methodology sources measuring productivity gains, quality effects, hallucination risk, workflow design, and verification in AI-supplemented legal and professional work.
Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality
Dell'Acqua, F., McFowland III, E., Mollick, E. R. et al. (Harvard Business School / BCG, 2023)
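The landmark RCT on professional AI use: 758 BCG consultants, 18 tasks. Source for the 'jagged frontier' concept: inside the frontier, consultants using AI worked about 25% faster and produced roughly 40% higher-quality output; outside it, they were 19 percentage points less likely to reach a correct solution. Foundation for the Research Evidence section.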
Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence
Noy, S. & Zhang, W. (MIT, 2023) — published in Science
Preregistered RCT with 453 professionals on writing tasks (the Science version; the earlier working paper reported 444). Source for the 40% speed gain, 18% quality improvement, and skill-compression findings in the Research Evidence section.
Lawyering in the Age of Artificial Intelligence
Choi, J. H., Monahan, A. & Schwarcz, D. (Minnesota Law Review Vol. 109, 2024)
First legal-specific RCT: law students with/without GPT-4 on realistic legal tasks. Source for the consistent speed gains / uneven quality improvement finding and the junior-lawyer uplift pattern.
AI-Powered Lawyering: AI Reasoning Models, Retrieval Augmented Generation, and the Future of Legal Practice
Schwarcz, D., Manning, S., Prescott, J. J. et al. (Journal of Law and Empirical Analysis, 2026)
Primary source for the 2026 legal RCT showing quality gains from o1-preview and Vincent AI, fewer hallucinated citations with the RAG-grounded tool, and task-specific quality effects.
Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis
Chen, B. M. & Bao, H. (2026)
Source for the claim that untrained LLM access can be counterproductive in legal analysis, while brief training improved adoption and scores.
Artificial Intelligence and Human Legal Reasoning
Bednar, N., Cleveland, D. R., Erbsen, A. & Schwarcz, D. (2026)
Source for the workflow-placement claim: AI helped early legal synthesis without reducing later comprehension, but AI revision helped weaker memos while degrading stronger ones.
LegalCheck: Retrieval- and Context-Augmented Generation for Drafting Municipal Legal Advice Letters
van der Meer, V. & Rossi, J. (ICAIL 2026)
Deployment study supporting the legal-engineering claim that curated legal knowledge bases, controlled prompting, and expert-in-the-loop review can produce near-final legal drafts in a bounded workflow.
Reimagining Legal Fact Verification with GenAI: Toward Effective Human-AI Collaboration
Han, S., Zhang, Y., Huang, Y. et al. (CHI 2026)
Interview study supporting the claim that legal AI fact-verification workflows require auditability, accountability, confidentiality controls, and human legal judgment.
Benchmarking Legal RAG: The Promise and Limits of AI Statutory Surveys
Afane, M., Hariri, E., Ouyang, D. & Ho, D. E. (ACM CS&Law 2026)
Benchmark source for the claim that specialized statutory RAG and legal error analysis can outperform generic or commercial legal AI setups, but retrieval and reasoning failures remain material.
Legal RAG Bench: an end-to-end benchmark for legal RAG
Butler, A.-R. & Butler, U. (2026)
Benchmark source for the claim that retrieval quality sets the ceiling for many legal RAG workflows and that groundedness must be evaluated separately from answer fluency.
Generative AI in public administration: A quasi-experimental analysis of bureaucratic productivity
Kim, E. (Government Information Quarterly, 2026)
Quasi-experimental source for the claim that specialized GenAI can reduce task-level drafting time in rule-bound public-sector workflows, especially for newer employees.
Generative AI and labour productivity: A quasi experiment on coding
Gambacorta, L., Qiu, H., Shan, S. & Rees, D. M. (Journal of Financial Stability, 2026)
Source for the measurement caution that AI can increase output volume more than useful task completion, making workflow-level productivity metrics essential.
Generative AI and Worker Productivity: A Systematic Review and Quantitative Evidence Synthesis (2023-2026)
Singh, H. V. (Indian Institute of Management Bangalore, 2026)
Systematic review source for the cross-study synthesis that task-level AI productivity gains are real but heterogeneous by task, expertise, and measurement method.
LawFlow: Collecting and Simulating Lawyers' Thought Processes
Das, S. et al. (Microsoft Research / arXiv, 2025)
Source for the legal-engineering claim that real legal work is an adaptive workflow with decision points and review loops, not just isolated answer generation.
An Uncommon Task: Participatory Design in Legal AI
Delgado, F., Barocas, S. & Levy, K. (Proceedings of the ACM on Human-Computer Interaction, 2022)
Methodology source for the claim that legal AI evaluation and tool design benefit from participatory methods where lawyers and technologists co-design tasks, simulations, and criteria.
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
Guha, N. et al. (2023)
Benchmark source for the evaluation-rubric claim: legal AI should be tested against legal task taxonomies rather than generic model capability claims.
LegalBench-RAG: A Benchmark for Retrieval-Augmented Generation in the Legal Domain
Pipitone, N. & Alami, G. (2024)
Benchmark source for the retrieval-grounding claim that legal RAG needs expert-annotated relevant passages and legal-domain groundedness checks.
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Dahl, M. et al. (2024)
Source for the verification-risk claim that legal hallucinations remain a core failure mode requiring citation checks and human legal review.
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Magesh, V. et al. (Stanford HAI / RegLab, 2024)
Source for the claim that even legal RAG and legal research tools require human verification because hallucinations and unsupported legal propositions can persist.
Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review
Grossman, M. R. & Cormack, G. V. (Richmond Journal of Law & Technology, 2011)
Foundational empirical paper establishing that TAR matches or exceeds exhaustive manual review recall at dramatically lower cost. Kept as background context for validated, auditable legal automation.
Comparing the Performance of Artificial Intelligence to Human Lawyers in the Review of Standard Business Contracts
LawGeex / Bucerius Law School (2018)
Vendor-funded controlled comparison retained as supporting context for narrow contract-review automation claims; no longer used as a headline academic evidence card.
Future of Professionals Report 2024
Thomson Reuters Institute (2024)
Industry survey retained as adoption context for legal and compliance professionals; not used as primary academic evidence for productivity or quality claims.
Another New Study of Legal AI Shows Some Models Can Significantly Improve Work Quality and Efficiency
Bob Ambrogi / LawNext (2025)
Practitioner coverage retained as context for how legal AI RCT findings were surfaced in the legal industry; primary claims now use the 2026 journal article directly.
Legal AI news
PowerUser
Publications and outlets tracked daily for legal AI tool launches, court rulings, adoption data, and market developments.
LawSites (LawNext) by Bob Ambrogi
Robert Ambrogi
The most thorough independent tracker of legal technology developments. Primary source for the Recent Developments feed — covering tool launches, court decisions, and adoption data with detailed analysis.
Artificial Lawyer
Artificial Lawyer
Legal AI industry publication covering tool launches, market developments, and technology trends. Source for Microsoft Legal Agent and OpenAI legal vertical coverage in the developments feed.
2026 Legal Industry AI Adoption Report
8am
Survey of 1,300+ legal professionals conducted in late 2025. Found individual AI adoption doubled to 69% in a year while institutional governance (policies, training) significantly lagged — the governance gap entry in the developments feed.
Legal AI's Next Phase: Built With Lawyers, Measured in Practice
National Law Review
Source for the hallucination incidents data: 1,348 cases catalogued worldwide as of April 2026, growing from ~2/week to ~2-3/day. Covers the shift in legal AI toward reasoning-based approaches.
Florida Supreme Court: Amendment to Rule 2.515(d)(2) — AI Citation Certification
The Florida Bar / Florida Supreme Court
Effective June 15, 2026, requires all signers of Florida court filings to certify that cited legal authorities exist and are accurately cited. Direct regulatory response to AI hallucination incidents in court submissions.
National Law Review
National Law Review
Legal news outlet covering AI regulation, court rulings, and compliance developments. Monitored as part of the daily digest.
EU AI regulation
PowerUser
Official EU sources, independent trackers, and law firm commentary on the AI Act, GDPR enforcement, and European AI governance.
European Commission — AI Act
European Commission
Official EU source for AI Act updates, implementation guidance, and enforcement timelines. Primary regulatory reference for EU AI coverage in the digest.
EU AI Act Tracker
Future of Life Institute
Independent tracker of the EU AI Act's progress, obligations by risk tier, and implementation deadlines. Useful for quickly checking where a specific article or obligation stands.
European Data Protection Board (EDPB)
EDPB
EDPB guidance and enforcement actions on AI and GDPR intersections — transparency obligations, data subject rights, and coordinated enforcement. Monitored for AI-specific opinions and decisions.
EURACTIV
EURACTIV
EU policy news outlet with dedicated AI coverage. Strong on legislative process, member-state positions, and Brussels negotiations around the AI Act and digital regulation.
AlgorithmWatch
AlgorithmWatch
Investigative and policy outlet tracking algorithmic accountability and AI governance in Europe. Covers enforcement gaps, civil society responses, and high-risk AI system incidents.
Bird & Bird AI Insights
Bird & Bird
Law firm insights on EU and global AI regulation — practical compliance analysis written for practitioners. Useful for understanding legal obligations in plain terms.
Frontier model labs
PowerUser
Official news and blogs from the major foundation model providers, monitored for model releases, pricing changes, and capability updates.
Anthropic News
Anthropic
Official Anthropic announcements covering model releases, safety research, and product updates. Primary source for Claude-related developments in the digest.
OpenAI News
OpenAI
Official OpenAI announcements covering model releases, API changes, and product launches. Monitored for frontier model developments relevant to legal and enterprise AI.
Google DeepMind Blog
Google DeepMind
Google DeepMind's research and product announcements, including Gemini model releases. Monitored for capability and pricing developments.
Mistral AI News
Mistral AI
Mistral's model and product announcements. Relevant for open-weight model releases and European frontier AI developments.
Meta AI Blog
Meta AI
Meta's AI research and product updates, including Llama model releases. Monitored for open-weight model developments and inference ecosystem changes.
AI coding tools
PowerUser
Changelogs, blogs, and practitioners covering AI-assisted coding workflows, agents, and developer tooling.
Cursor Changelog
Anysphere / Cursor
Release notes and feature updates for the Cursor AI code editor. Monitored for prompting and agentic coding workflow developments.
GitHub Blog
GitHub
GitHub's official blog covering Copilot updates, Actions features, and developer AI tools. Monitored for AI coding workflow and agent developments.
Cognition Blog (Devin)
Cognition AI
Cognition AI's blog covering Devin — the autonomous software engineering agent. Relevant for understanding agentic coding patterns and capabilities.
Prompting & coding tips
PowerUser
Communities and writers where practical prompting techniques and LLM workflow patterns surface and get stress-tested.
Hacker News
Y Combinator
Tech community aggregator where popular prompting techniques and LLM workflow threads frequently surface. Front page and search reliably capture what practitioners are finding useful.
r/ClaudeAI
Community for Claude users sharing prompting techniques, workflows, and tips. High-upvote threads reliably surface practical insights that haven't made it to formal writeups yet.
r/LocalLLaMA
Community focused on running and prompting open-weight models locally. Techniques discussed here often generalise to hosted models. Strong signal for prompting patterns and inference optimisations.
Latent Space
swyx & Alessio Fanelli
Podcast and newsletter covering AI engineering, prompting research, and practitioner workflows. Surfaces Twitter/X discourse and translates it into structured analysis.
Andrej Karpathy — Blog & Posts
Andrej Karpathy
Karpathy's writing and social posts on LLM behaviour, prompting intuitions, and AI learning. His observations on X regularly generate high-signal discussion worth tracking.
Digest aggregators
PowerUser
Catch-all aggregators monitored daily to surface AI developments that may not appear in category-specific sources.
LLM Deep Dive
PowerUser
Deep Dive into LLMs like ChatGPT
Andrej Karpathy
The primary reference for the LLM Deep Dive module. Karpathy's lecture covers the full stack from pretraining data and tokenization through the transformer architecture, alignment, and reasoning models — all without requiring a maths background.
FineWeb: Pretraining Dataset
HuggingFace
HuggingFace's open pretraining dataset and interactive demo. Shows the quality filtering pipeline used to build web-scale training data, making abstract data curation decisions concrete and explorable.
Tiktokenizer
dqbd
Interactive tokenizer visualiser. Shows in real time how different text is split into tokens by GPT-4, Llama, and other tokenizers — essential for building intuition about cost, context limits, and tokenization failure modes.
Transformer Neural Net 3D Visualiser
Brendan Bycroft
A 3D interactive visualisation of the transformer architecture showing how tokens flow through attention and feed-forward layers. Referenced in the architecture lesson as a hands-on companion.
llm.c: Let's Reproduce GPT-2
Andrej Karpathy
Karpathy's from-scratch C implementation of GPT-2 pretraining. Makes the pretraining process concrete and auditable — valuable for understanding exactly what pretraining does and how base models differ from assistant models.
The Llama 3 Herd of Models
Meta AI Research
Meta's technical report for Llama 3. One of the most detailed public accounts of a modern LLM training pipeline — pretraining data, tokenizer design, architecture choices, and alignment approach. Referenced throughout the deep dive module.
Hyperbolic — Base Model Inference
Hyperbolic
Cloud inference platform offering access to base models (pre-alignment) alongside instruction-tuned variants. Allows direct comparison of base vs. fine-tuned model behaviour — used in the pretraining lesson challenge.
Training Language Models to Follow Instructions with Human Feedback (InstructGPT)
Long Ouyang et al. (OpenAI, 2022)
The paper that established RLHF as the standard approach to LLM alignment. Demonstrated that a 1.3B parameter aligned model outperforms a 175B base model on human preference metrics — making alignment quality as important as scale.
HuggingFace Inference Playground
HuggingFace
Browser-based interface for running open models via HuggingFace's inference API. Used in the alignment lesson challenge to compare base vs. instruction-tuned model behaviour.
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek AI (2025)
The paper demonstrating that RL training on verifiable tasks produces emergent chain-of-thought reasoning without explicit demonstrations. Open-source weights enabled broad reproducibility. Central reference for the reasoning models lesson.
TogetherAI Playground
Together AI
Cloud inference playground for open models including DeepSeek-R1, Llama 3, and Mistral variants. Used in the reasoning models lesson challenge for side-by-side model comparison.
Mastering the Game of Go with Deep Neural Networks and Tree Search
David Silver et al. (DeepMind, 2016)
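The AlphaGo paper showing that RL with self-play, bootstrapped from supervised learning on human expert games, can reach superhuman play in Go and find moves human experts did not anticipate. Its successor, AlphaGo Zero (2017), removed the human demonstrations entirely. Provides the conceptual foundation for understanding how RL training enables reasoning models to discover chain-of-thought strategies.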
LM Arena
LMSYS / UC Berkeley
Human preference benchmark for language models. Real users vote on blind model comparisons, producing Elo-based rankings that reflect genuine user preference rather than academic benchmark performance.
AI News Newsletter
swyx / Lior Bar
Daily newsletter summarising significant AI research, model releases, and industry developments. Recommended in the inference ecosystem lesson as a reliable way to stay current in a fast-moving field.