OpenAI’s New Deep Research Model

Posted on February 3, 2025 by Sean Harrington

This post is brought to you by ChatGPT’s Deep Research. It produced this report after about 10 minutes of thinking and searching online (sources in the footnotes). I have also used it for a couple of fairly complex legal research queries and it produces the equivalent of an article from a treatise – they were about 10-15 pages and had 20+ footnotes. One of them was a bankruptcy question and, interestingly, it cited to Collier on Bankruptcy. I looked up the citation in Lexis and it was accurate…

You’ll notice in the following post that it seems to cite to the University of Arizona quite a lot. I would assume this is because their web presence is either highly optimized for search (easy to find) or their site does not (yet) have a robots.txt file that blocks AI crawlers, but I’m interested to hear if you have a different hypothesis.

Deep Research Output:

Great, I will conduct a comprehensive review of how AI is transforming U.S. law schools from 2023-2025. This will include changes in teaching methods, faculty and law librarian practices, and collection development in law libraries. I’ll draw from trusted sources such as LawNext, GeekLawBlog, Artificial Judgment, Bespacific, Legal Technology, and the ABA’s tech section. The blog post will be structured with clear headings and span approximately three pages.

I’ll notify you when the research is complete.

How AI is Reshaping U.S. Law Schools (2023–2025)

Artificial intelligence has rapidly moved from science fiction into the everyday realities of legal education. Since the emergence of powerful generative AI tools like OpenAI’s ChatGPT in late 2022, U.S. law schools have been grappling with how to adapt. Between 2023 and 2025, AI has begun to reshape law schools in profound ways – from what and how students learn, to how professors and librarians do their jobs, to the very makeup of library resources. Below, we explore these changes in detail.

Transforming Legal Education with AI

Curriculum Integration: Law schools are increasingly weaving AI into the curriculum. A survey by the ABA’s Task Force on Law and Artificial Intelligence in early 2024 found that 55% of responding schools offer courses dedicated to AI, and 83% offer some opportunities (like clinics or projects) for students to learn to use AI tools (Recent Reports of Law Schools’ AI Adoption Have Been Greatly Exaggerated | LawSites) (3 Keys for Law Students Using AI). (It’s worth noting the survey had a low response rate, so the true numbers across all schools are likely lower (Recent Reports of Law Schools’ AI Adoption Have Been Greatly Exaggerated | LawSites).) Even so, the trend is clear: a growing number of law schools now teach about AI and law. By mid-2024, an independent tally found at least 62 U.S. law schools (roughly 31%) offering courses on AI and the law, with nearly 100 AI-related law courses between them (Remember that ABA Survey of Law Schools with AI Classes? This May Be A More Accurate List | LawSites). This represents a sharp increase from just a few years ago when such courses were rare. In a high-profile move, Case Western Reserve University School of Law announced in 2025 that all first-year students must complete a certification in legal AI. Their new mandatory program, “Introduction to AI and the Law,” gives 1Ls hands-on experience with AI-powered legal tools and explores the evolving landscape of AI regulation and ethics (Case Western Reserve University School of Law to require legal AI education certification | beSpacific). As Case Western’s associate dean Avidan Cover put it, “By integrating AI education into the foundational curriculum, we are… empowering our students with the tools and knowledge they need to navigate and lead in a technology-driven profession” (Case Western Reserve University School of Law to require legal AI education certification | beSpacific). These steps signal a recognition that tomorrow’s lawyers must understand AI’s role in law.

Teaching Methodologies: Alongside new courses, professors are experimenting with AI-integrated teaching methods. Educators are realizing that students need practical skills to work with AI. Some law professors now design assignments where students collaborate with or critique AI. For example, instructors have reported class exercises where students use a tool like ChatGPT to draft a portion of a memo or brief, then have to critically evaluate and improve the AI’s work (Teaching Law In The Age Of Generative AI). Others have students “role-play with chatbots” – treating an AI as a fake client or opposing counsel in simulations – to sharpen interviewing and counseling skills (Teaching Law In The Age Of Generative AI). These innovative exercises teach students how to use AI as a helper while also spotting its weaknesses. “AI is already having a significant impact on legal education and is likely to result in additional changes in the years ahead,” noted the ABA’s report, which observed that law schools are evolving their pedagogy to meet the demands of a tech-shaped profession (3 Keys for Law Students Using AI). Professors are also increasingly discussing the ethics and pitfalls of AI in class. For instance, students may be asked to debate the responsible use of generative AI in practice – covering issues like bias, confidentiality, and the unreliability of “hallucinated” (fabricated) citations or facts.

Student Assessments and Academic Integrity: The rise of AI has forced law schools to rethink exams and evaluations. Early on, there was panic that ChatGPT would be “the death of the essay” – that students could simply have a bot write their papers or exam answers (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog). In response, many schools rushed to set policies on AI-assisted work. A Kaplan survey found that by late 2024 about 45% of law schools explicitly banned generative AI for writing admissions essays, while only 1% officially allowed it (the rest had no formal policy) (This law school is among the 1% that allow use of AI to write admissions essays). Some law professors similarly banned AI use on assignments unless disclosed, treating undisclosed use as plagiarism. However, as faculty gained experience, outright panic has given way to a more measured approach. Notably, professors at University of Minnesota Law School tested ChatGPT on four real law exams in 2023 – and the AI averaged a C+ grade (around the bottom 20% of the class) (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog). On one hand, ChatGPT’s answers showed “a strong grasp of basic legal rules” and solid organization, mimicking a competent but not exceptional student (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog). On the other hand, when questions required deeper analysis or reference to specific cases, ChatGPT often failed spectacularly, sometimes earning the worst score in the class (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog). These findings reassured faculty that current AI isn’t (yet) acing law school. As one legal writing professor noted, AI-generated essays tend to be superficial – correct on black-letter law, but poor at issue-spotting and applying law to facts in nuanced ways (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog). Knowing this, some instructors now design assessments that emphasize critical thinking and personalized analysis that AI can’t easily mimic. Others allow limited use of AI for preliminary drafting or idea-generation, so long as students cite it and then significantly edit the work. The overarching goal is to maintain academic integrity while also recognizing that learning to supervise AI might be a valuable skill in itself. In fact, a growing view is that students should graduate with experience in using AI appropriately rather than with a complete prohibition. As one law dean remarked, a “material number of law schools are responding aggressively” to AI’s rise, viewing these early adaptations as “a sign of what’s to come in terms of legal education” (Recent Reports of Law Schools’ AI Adoption Have Been Greatly Exaggerated | LawSites).

Faculty and Law Librarians Adapt to AI

It’s not just students adjusting – professors and law librarians are also leveraging AI to streamline their work and enhance their scholarship. Law faculty have discovered that generative AI can assist with some of their most time-consuming tasks. For instance, professors in various disciplines have used tools like ChatGPT as a research assistant – generating initial literature reviews, summarizing cases or articles, and even brainstorming novel legal arguments. In one headline-grabbing experiment, a U.S. law professor (Andrew Perlman of Suffolk University Law School) co-wrote an entire law review article with ChatGPT in 2024, asking the AI to “develop a novel conception of the future of legal scholarship.” The resulting paper – with the AI’s contributions clearly marked – argued that AI will expand the scale and scope of academic legal writing, allowing scholars to iterate ideas faster (Recent Reports of Law Schools’ AI Adoption Have Been Greatly Exaggerated | LawSites). This “AI-written” article was deliberately provocative, but it underscores how some faculty are testing AI’s capabilities in scholarship. Even professors who don’t go that far are finding smaller ways to use AI day-to-day. Drafting and editing is one example: a professor might have ChatGPT generate a rough syllabus paragraph or multiple-choice questions for an exam, then refine them. Others report using AI to help grade by, say, inputting a model answer and a student answer and asking the AI to compare – though due to reliability concerns, AI isn’t replacing human graders, just assisting as a second pair of eyes.

Faculty are also harnessing AI for administrative efficiency. Generative AI can quickly produce first drafts of routine emails, recommendation letters, or research grant applications, which professors can then personalize. This helps cut down on rote writing tasks. Importantly, professors approach these uses with caution – verifying facts and ensuring any AI-generated text is accurate and in their own voice. As one might expect, tech-savvy law professors have also started incorporating AI topics into their research agendas. Legal scholars are producing a surge of articles on AI’s implications for everything from intellectual property to criminal justice. In doing so, many use AI tools to analyze data (for example, using machine learning to parse large sets of court decisions). In short, the professoriate is learning to work smarter with AI, treating it as a junior research assistant that never sleeps, albeit one that requires heavy supervision.

Meanwhile, law librarians – often the vanguard of technology in legal academia – have been quick to explore AI’s potential in libraries. “Law librarians have always been early technology adopters and trainers,” notes Cas Laskowski, a research and instruction librarian at Duke, “we are proud to launch [projects] focusing on AI and how libraries can remain… equitable community anchors in our increasingly tech-driven world” (Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future | University of Arizona Law). One early and practical use of AI for librarians is summarizing and organizing information. Law librarians are skilled at finding vast amounts of information for faculty and students, but summarizing that trove can be labor-intensive. In early 2023, Greg Lambert (a law librarian and blogger) tried using GPT-3.5 to automatically summarize legal news and podcasts. He was “impressed” with the results, noting that “any tool that would help librarians synthesize information in a useful way is a welcome tool” in an environment where one librarian may support hundreds of attorneys or students (What a Law Librarian Does with AI Tools like ChatGPT – Organize and Summarize | 3 Geeks and a Law Blog). By having an AI quickly generate a concise summary of, say, a 50-page report or a set of new case law updates, librarians can save time and serve their patrons faster. Similarly, AI can help organize data – some libraries have experimented with AI to classify documents or tag resources with relevant topics, augmenting traditional cataloging.

Law librarians are also leveraging AI to assist faculty and students in legal research. As new AI-driven legal research platforms emerged (from startups and the big vendors), librarians took on the role of testers and guides. In 2023 and 2024, many law librarians ran experiments with tools like Harvey, Casetext’s CoCounsel, Westlaw Precision with AI, and Lexis+ AI to see how they perform. Their dual aim: to advise users on the tools’ effectiveness and to flag pitfalls. Notably, when LexisNexis rolled out its generative AI legal research assistant (“Lexis+ AI”), librarians and professors scrutinized its output. Early tests were not entirely reassuring – one Canadian law professor found that Lexis+ AI’s results were “riddled with mistakes”, including citing non-existent legislation and mischaracterizing cases, leading him to conclude it “should not be used by law students yet” (Canadian law professor gives Lexis+ AI “a failing grade” – LexisNexis responds – Legal IT Insider). U.S. law librarians have reported similar concerns with other AI tools “hallucinating” citations or facts. As a result, academic law librarians have been busy updating research training for students: they teach not just how to use these new AI research features, but also how to double-check AI outputs against reliable sources. In some law schools, librarians created guides and tutorials on the dos and don’ts of generative AI in legal research – emphasizing that these tools can save time only if used carefully and not as a substitute for traditional verification (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog).

Finally, on the administrative side, AI is helping law librarians behind the scenes. Some libraries are using AI-driven analytics to make decisions about collection usage and budgeting. For example, AI can analyze which databases or journals are underused (to consider cancellation) or predict what topics will be in demand, informing acquisition decisions. Library staff are also exploring AI to automate repetitive tasks like updating citation metadata, checking links, or even answering common patron questions via chatbots. All these adaptations free up librarians to focus on higher-level tasks, like one-on-one research consultations and teaching – the things that AI can’t easily replace.

AI in Law Library Collections and Research

The very collections and databases that law libraries manage are evolving thanks to AI. Not long ago, a law library’s idea of high-tech was providing online databases like Westlaw, Lexis, or HeinOnline. By 2023–2025, those databases themselves have gotten smarter, and libraries are acquiring completely new AI-based resources.

AI-Powered Research Tools: A major development has been the integration of generative AI into legal research platforms. Both Westlaw and LexisNexis launched AI-enhanced research offerings in 2023, allowing users to input natural-language questions and receive narrative answers or brief drafts alongside the usual list of sources. For example, Lexis+ AI can draft a legal memorandum based on a query, using the content of its vast database to generate an answer (with cited sources). These tools promise to save time by synthesizing cases and statutes on a given issue. Law libraries, as the gatekeepers to research services, have been evaluating these tools for inclusion in their digital collections. By late 2024, many law libraries were running pilot programs with generative AI research tools, often in partnership with the vendors. Law librarians must decide: do these tools provide reliable value for students and faculty? Given the mixed early reviews (with accuracy concerns like those noted above), libraries are adopting a cautious approach – often providing access to AI research tools on an optional basis and collecting feedback before fully integrating them into the curriculum. Still, the direction is set. As one legal tech commentator observed in 2024, “New and emerging AI technologies, such as GPT models, have the potential to reshape the legal landscape… Law schools should consider updating their curriculum to reflect the growing use of AI in the legal field” (AI vs Law Schools: The Cost of Ignoring the Future). In practice, this means law libraries will increasingly offer AI-driven databases alongside traditional resources, ensuring students learn how to use cutting-edge tools like brief analyzers, contract review AI, and chat-style legal Q&A systems.

Digital Collections & Knowledge Management: Beyond commercial databases, law libraries are using AI to enhance their own institutional repositories and archives. Digital collections of case law, scholarship, or historical legal materials can be made more accessible with AI. For instance, some libraries are experimenting with natural language search or AI chat interfaces for their archives, so a student could query an alumni brief bank or a faculty publications repository in plain English and get relevant results summarized. AI can also assist in transcribing and organizing audio/visual content (like recordings of lectures or court oral arguments) making them searchable text for researchers. In collection development, librarians are intrigued by AI tools that can predict research trends – if an AI analyzes millions of data points (news, scholarly articles, court dockets) and predicts a surge in, say, climate law litigation, a library might proactively acquire more in that area. While such predictive collection development is still emerging, it’s on the horizon.

Acquisitions and Budgeting: The role of AI has also reached the budgeting and acquisition strategies of law libraries. AI-driven analytics help librarians identify which resources provide the most value. By automatically compiling usage statistics and user feedback, AI can suggest which subscriptions to renew or cancel. Some law library systems use machine learning to recommend new titles based on current collection and usage patterns, akin to how e-commerce sites suggest products. This makes collection development more data-driven. Additionally, the cost of new AI tools is a factor – many of these advanced research platforms come at premium prices. Law libraries must justify these costs by demonstrating the educational value. The collaborative initiative “Future of Law Libraries: AI, Opportunities, and Advancement,” led by several law schools in 2023, is one example of libraries banding together to tackle these questions (Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future | University of Arizona Law). Through a series of nationwide roundtables, law library leaders are sharing strategies on budgeting for AI and negotiating with vendors, as well as brainstorming best practices for training users on these resources (Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future | University of Arizona Law). As Teresa Miguel-Stearns (director of University of Arizona’s law library) explained, “Artificial intelligence promises to revolutionize how law libraries function and provide value… Through insightful discussions, we hope to identify constructive ways law libraries can plan for and utilize AI to improve access to legal information [and] enhance legal education” (Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future | University of Arizona Law). In short, collection development policies are being updated to account for AI – both as content (buying AI tools) and as methodology (using AI to make decisions).

Maintaining Trust and Accuracy: A core mission of law libraries is to provide trustworthy legal information. Thus, librarians are carefully monitoring the accuracy of AI in research. As noted, early adopters found that generative AI legal tools sometimes generate fake citations or flawed analysis, which could mislead the unwary researcher. To protect the integrity of the library’s offerings, some law libraries have set usage guidelines for AI tools: for example, warning students that AI answers are only a starting point and must be verified against primary sources. A few libraries even require users to sign in to use certain AI tools, so they can track outcomes and intervene if needed. Publishers are responding to these concerns too – for instance, newer versions of legal AI systems are emphasizing transparency by providing hyperlinks to sources for every assertion, and some use “embedded reference checking” to alert if a cited case seems inconsistent with the proposition. Law libraries are keenly evaluating these features. The end goal is to integrate AI into collections without compromising on reliability. In the words of one law school technology director, it’s about finding ways to “leverage AI for law libraries in the future” while avoiding the mistakes of unchecked automation (Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future | University of Arizona Law).

Conclusion

From 2023 to 2025, the influence of AI on U.S. law schools has grown from a novelty to an impending reality that no law school can ignore. In legal education, AI is driving changes in what is taught (more tech and AI law content), how it’s taught (new teaching methods and careful assessment policies), and what skills students graduate with (hopefully, the ability to use AI tools ethically and effectively). Faculty and librarians are not standing idle; they are adopting AI to augment their research, teaching, and administrative duties, even as they remain cautious about its limits. Law libraries, often the heart of a law school, are transforming their collections and services to include powerful AI resources, all while upholding the standards of accuracy and thoroughness that define legal research.

Crucially, this period has also seen a shift in mindset: rather than view AI only as a threat (to academic honesty or to jobs), many in legal education now see it as an opportunity. As one ABA tech report observed, ignoring AI is not an option – the future of law will be shaped by those attorneys who are competent with technology (AI vs Law Schools: The Cost of Ignoring the Future). Law schools that adapt now are positioning their graduates to be those competent attorneys. In the years ahead, we can expect AI to be as commonplace in law classrooms and libraries as Westlaw and casebooks have been for decades. The reshaping of law schools by AI is well underway, and 2023–2025 has been just the beginning of that profound evolution.

Sources:

Ambrogi, Bob. “Recent Reports of Law Schools’ AI Adoption Have Been Greatly Exaggerated.” LawSites, July 2, 2024 (Recent Reports of Law Schools’ AI Adoption Have Been Greatly Exaggerated | LawSites)
Ambrogi, Bob. “Remember that ABA Survey of Law Schools with AI Classes? This May Be A More Accurate List.” LawSites, Aug. 1, 2024 (Remember that ABA Survey of Law Schools with AI Classes? This May Be A More Accurate List | LawSites)
Case Western Reserve Univ. School of Law. “Case Western Reserve… to require legal AI education certification.” beSpacific, Jan. 23, 2025 (Case Western Reserve University School of Law to require legal AI education certification | beSpacific)
Bliss, John. “Teaching Law in the Age of Generative AI.” Jurimetrics, Winter 2024 (Teaching Law In The Age Of Generative AI)
Lambert, Greg (Jennifer Wondracek & Rebecca Rich). “Experiments with ChatGPT: Don’t Panic…” 3 Geeks and a Law Blog, Jan. 30, 2023 (Experiments with ChatGPT: Don’t Panic, the Robots Are Not Writing Your Students’ Legal Memos | 3 Geeks and a Law Blog)
Lambert, Greg. “What a Law Librarian Does with AI Tools like ChatGPT.” 3 Geeks and a Law Blog, Jan. 13, 2023 (What a Law Librarian Does with AI Tools like ChatGPT – Organize and Summarize | 3 Geeks and a Law Blog)
Miguel-Stearns, Teresa, et al. “Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future.” University of Arizona Law News, Sept. 6, 2023 (Law Libraries Launch Initiative to Prepare for Artificial Intelligence Future | University of Arizona Law)
Hill, Caroline. “Canadian law professor gives Lexis+ AI ‘a failing grade’ – LexisNexis responds.” Legal IT Insider, Nov. 18, 2024 (Canadian law professor gives Lexis+ AI “a failing grade” – LexisNexis responds – Legal IT Insider)
Butalia, Manit. “AI vs Law Schools: The Cost of Ignoring the Future.” Law Technology Today (ABA), Nov. 26, 2024 (AI vs Law Schools: The Cost of Ignoring the Future)
Cassens Weiss, Debra. “Majority of law schools have no policy on using AI for admissions essays.” ABA Journal, Oct. 9, 2024 (This law school is among the 1% that allow use of AI to write admissions essays)

Revolutionizing Legal Education with AI: The Socratic Quizbot

Posted on December 10, 2024 by Sean Harrington

I had the pleasure of co-teaching AI and the Practice of Law with Kenton Brice last semester at OU Law. It was an incredible experience. When we met to think through how we would teach this course, we agreed on one crucial component:
We wanted the students to get a lot of reps using AI throughout the entire course.

That is fairly easy to accomplish for things like research, drafting, and general studying for the course, but we hit a roadblock with the assessment component. I thought about it for a week and said, “Kenton, what if we created an AI that would Socratically quiz the students on the readings each week?” His response was, “Do you think you can do that?” I said, “I don’t know but I’ll give it a try.” 🤷‍♂️

Thus Socratic Quizbot was born. If you follow me on social media, you’ve probably seen me soliciting feedback on the paper:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4975804

December 2024 Update

Purpose

I'm no longer a teacher. I'm just a human plagiarism detector.

I used to spend my grading time giving comments for improving writing skills. Now most of that time is just checking to see if a student wrote their own paper. What a waste of life.
— Dr. Amy Clukey (@AmyClukey) October 4, 2024

A lot of the motivation for Quizbot was a new paradigm in the law school ecosystem: the take-home essay is effectively dead. In fact, lots of the typical homework that you would assign as a law school professor simply breaks when you introduce something like ChatGPT or Claude into our world. We needed to come up with new methods of assessment.

I knew that these tools were really good at ingesting documents like PDFs and then summarizing them (manipulating the text, generating based on the text, etc.). What I needed was an AI that could read our course readings and then have a back-and-forth Socratic conversation with the students about those readings, and then some method to assess those conversations so that I could give students a grade. This felt like a big task with many potential pitfalls for one guy who is only mediocre (at best) at coding and app development.
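To make that shape concrete, here is a minimal sketch of the back-and-forth loop described above. It is not the Quizbot code from the GitHub repository. Every name in it (Turn, QuizSession, build_prompt, run_quiz, make_scripted_model) is hypothetical, and the "model" is a scripted stand-in so the example runs without any API key. In a real deployment the ask_model function would presumably wrap a call to whichever chat model you use.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    role: str  # "bot" or "student"
    text: str


@dataclass
class QuizSession:
    reading: str            # text of the assigned reading
    max_questions: int = 3  # hypothetical cap on questions per quiz
    transcript: List[Turn] = field(default_factory=list)


def build_prompt(session: QuizSession) -> str:
    """Combine the tutor instruction, the reading, and the conversation so far."""
    history = "\n".join(f"{t.role}: {t.text}" for t in session.transcript)
    return (
        "You are a Socratic tutor. Ask one probing question at a time about "
        "the reading below, building on the student's previous answers.\n\n"
        f"READING:\n{session.reading}\n\nCONVERSATION SO FAR:\n{history}"
    )


def run_quiz(
    session: QuizSession,
    ask_model: Callable[[str], str],
    get_answer: Callable[[str], str],
) -> List[Turn]:
    """Alternate bot questions and student answers, keeping every turn for grading."""
    for _ in range(session.max_questions):
        question = ask_model(build_prompt(session))
        session.transcript.append(Turn("bot", question))
        session.transcript.append(Turn("student", get_answer(question)))
    return session.transcript


def make_scripted_model(questions: List[str]) -> Callable[[str], str]:
    """Stand-in for a real LLM call: returns canned questions in order."""
    remaining = iter(questions)
    return lambda prompt: next(remaining)


if __name__ == "__main__":
    session = QuizSession(reading="(text of this week's reading)", max_questions=2)
    model = make_scripted_model(["What is the holding?", "Why does that matter?"])
    for turn in run_quiz(session, model, get_answer=lambda q: "A canned student answer."):
        print(f"{turn.role}: {turn.text}")
```

The saved transcript is the piece that matters for assessment: once every turn is recorded, the conversation can be graded afterwards against whatever rubric the instructor chooses.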

As it turned out, I was able to fumble my way through the process and create a method of assessment that students seemed to enjoy. Alright, “enjoy” is probably too strong of a word, but they tolerated it and said they liked it quite a bit more than something like a multiple-choice test or a take-home essay. The Socratic Quizbot enables you to scale cold-calling to every student in the class while eliminating much of the stress and embarrassment that law students have dreaded since time immemorial.

Since many of the people who are interested in this blog post may have already read or skimmed my article, I decided to add my update as Appendix A so that you could simply fast-forward to the portion that is relevant to you. There is also a link to the open-source code on GitHub.

A Brief Overview of What is in Appendix A

Click here to see the article on SSRN, with the updated Appendix A

Appendix A was born out of one question I kept getting after sharing the pre-print of my article: “How?” So, let me show you exactly how you can implement the Socratic Quizbot in your classroom, along with some insights from my students who graciously let me experiment with them.

Student Feedback, Challenges, and Improvements
Students overwhelmingly preferred this method to essays or multiple-choice quizzes, citing the flexibility to ask for clarification and control the pace of their learning. It also reduced the fear of being judged by their peers. That said, a few students tried to game the system by flipping the questions back on the bot. My grading rubric handled that, but I’d like to make Quizbot more persistent in pressing them for answers next time. I’m also excited to explore gamification—adding themes, Easter eggs, or playful interactions to make the experience even more enjoyable.

Two Ways to Get Started
If you want to try this yourself, you’ve got two paths. The no-code approach uses ChatGPT Teams and involves setting up a CustomGPT that ingests your course readings and quizzes your students. This is great if you’re looking for quick implementation. For the more tech-savvy or budget-conscious, the code-based option lets you host Quizbot locally using the instructions I’ve shared on GitHub. It takes a bit more effort but gives you total control over security and functionality. Hopefully you will see a version of Socratic Quizbot available on CALI.org in the future: I have been talking with John and Elmer, and they both seem interested in integrating it into the platform (although do not hold them to that, because these are still very early talks).

Ultimately, my goal is to make this tool accessible for anyone in legal education. Whether you’re a tech whiz or new to AI, there’s a way to incorporate this into your classroom. And if you’re as curious about alternative assessments as I am, I’d love to hear your thoughts and ideas! The benefit of making it open on GitHub is that you can fork and improve my prototype. I would be deeply honored to see improvements on my little project and would love to see what our community can do with it.

Evaluating Generative AI for Legal Research: A Benchmarking Project

Posted on May 24, 2024 by Sean Harrington

This is a post from multiple authors: Rebecca Fordon (The Ohio State University), Deborah Ginsberg (Harvard Law Library), Sean Harrington (University of Oklahoma), and Christine Park (Harvard Law Library)

In late 2023, several legal research databases and start-up competitors announced their versions of ChatGPT-like products, each professing that theirs would be the latest and greatest. Since then, law librarians have evaluated and tested these products ad hoc, offering meaningful anecdotal evidence of their experience, much of which can be found on this blog and others. However, one-time evaluations can be time-consuming and inconsistent across the board. Certain tools might work better for particular tasks or subject matters than others, and coming up with different test questions and tasks takes time that many librarians might not have in their daily schedules.

It is difficult to test Large Language Models (LLMs) without back-end access to run evaluations. So to test the abilities of these products, librarians can use prompt engineering to figure out how to get desired results (controlling statutes, key cases, drafts of a memo, etc.). Some models are more successful than others at achieving specific results. However, as these models update and change, evaluations of their efficacy can change as well. Therefore, we plan to propose a typology of legal research tasks based on existing computer and information science scholarship and draft corresponding questions using the typology, with rubrics others can use to score the tools they use.

Although we ultimately plan to develop this project into an academic paper, we share here to solicit thoughts about our approach and connect with librarians who may have research problem samples to share.

Difficulty of Evaluating LLMs

Let’s break down some of the tough challenges with evaluating LLMs, particularly when it comes to their use in the legal field. First off, there’s this overarching issue of transparency, or rather, the lack thereof. We often hear about the “black box” nature of these models: you toss in your data, and a result pops out, but what happens in between remains a mystery. Open-source models allow us to leverage tools to quantify things like retrieval accuracy, text generation precision, and semantic similarity. With the commercial legal research products, however, we are unlikely to get the back-end access we need to perform these evaluations. Even if we did, the layers of advanced prompting and the combination of tools employed by vendors behind the scenes could render these evaluations essentially useless.
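To make "retrieval accuracy" and "semantic similarity" a little more concrete, here is a minimal sketch of the kind of measurement that back-end access would allow. Everything in it is an assumption for illustration: the case IDs are made up, and the word-overlap score is only a crude lexical stand-in for the embedding-based similarity measures that real evaluation toolkits use.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved IDs against a hand-built gold set."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def jaccard_similarity(text_a, text_b):
    """Share of distinct words the two texts have in common (lexical, not semantic)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


# Hypothetical data: the tool returned three cases; our gold set lists four.
retrieved = ["case_a", "case_b", "case_x"]
relevant = ["case_a", "case_b", "case_c", "case_d"]
precision, recall = precision_recall(retrieved, relevant)

# Compare a generated sentence with a reference answer written by a librarian.
similarity = jaccard_similarity("the court held", "the court ruled")
```

Without access to what a vendor's system actually retrieved, the `retrieved` list is exactly the piece we cannot observe, which is why we fall back on judging the final output.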

Even considering only the underlying models (e.g., GPT4 vs Claude), there is no standardized method to evaluate the performance of LLMs across different platforms, leading to inconsistencies. Many different leaderboards evaluate the performance of LLMs in various ways (frequently based on specific subtasks). This is kind of like trying to grade essays from unrelated classes without a rubric—what’s top-notch in one context might not cut it in another. As these technologies evolve, keeping our benchmarks up-to-date and relevant is becoming an ongoing challenge, and without uniform standards, comparing one LLM’s performance to another can feel like comparing apples to oranges.

Then there’s the psychological angle—our human biases. Paul Callister’s work sheds light on this by discussing how cognitive biases can lead us to over-rely on AI, sometimes without questioning its efficacy for our specific needs. Combine this with the output-based evaluation approach, and we’re setting ourselves up for potentially frustrating misunderstandings and errors. The bottom line is that we need some sort of framework for the average user to assess the output.

One note on methods of evaluation: just before publishing this blog post, we learned of a new study from a group of researchers at Stanford, testing the claims of legal research vendors that their retrieval-augmented generation (RAG) products are “hallucination-free.” The group created a benchmarking dataset of 202 queries, many of which were chosen for their likelihood of producing hallucinations. (For example, jurisdiction/time-specific and treatment questions were vulnerable to RAG-induced hallucinations, whereas false premise and factual recall questions were known to induce hallucinations in LLMs without RAG.) The researchers also proposed a unique way of scoring responses to measure hallucinations, as well as a typology of hallucinations. While this is an important advance in the field and provides a way to continue to test for hallucinations in legal research products, we believe hallucinations are not the only weakness in such tools. Our work aims to focus on the concrete applications of these LLMs and probe into the unique weaknesses and strengths of these tools.

The Current State of Prompt Engineering

Since the major AI products were released without a manual, we’ve all had to figure out how to use these tools from scratch. The best tool we have so far is prompt engineering. Over time, users have refined various templates to better organize questions and leverage some of the more surprising ways that AI works.

As it turns out, many of the prompt templates, tips, and tricks we use with the general commercial LLMs don’t carry over well into the legal AI sphere, at least with the commercial databases we have access to. For example, because the legal AIs we’ve tested so far won’t ask you questions, researchers may not be able to have extensive conversations with the AI (or any conversation for some of them). So that means we must devise new types of prompts that will work in the legal AI sphere, and possibly work only in the legal AI sphere.

We should be able to easily design effective prompts because the data set the AIs use is limited. But it’s not always clear exactly what sources the AI is using. Some databases may list how many cases they have for a certain court by year; others may say “selected cases before 1980” without explaining how they were selected. And even when the databases provide coverage, it may not be clear exactly which of those materials the AI can access.

We still need to determine what prompt templates will be most effective across legal databases. More testing is needed. However, we are limited to the specific databases we can access. While most (all?) academic law librarians have access to Lexis+ AI, Westlaw has yet to release its research product to academics.

Developing a Task Typology

Many of us may have the intuition that there are some legal research tasks for which generative AI tools are more helpful than others. For example, we may find that generative AI is great for getting a working sense of a topic, but not as great for synthesizing a rule from multiple sources. But if we wanted to test that intuition and measure how well AI performed on different tasks, we would need to first define those tasks. This is similar, by the way, to how the LegalBench project approached benchmarking legal analysis—they atomized the IRAC process for legal analysis down to component tasks that they could then measure.

After looking at the legal research literature (in particular Paul Callister’s “problem typing” schemata and AALL’s Principles and Standards for Legal Research Competency), we are beginning to assemble a list of tasks for which legal researchers might use generative AI. We will then group these tasks according to where they fall in an information retrieval schema for search, following Marchionini (2006) & White (2024), into Find tasks (which require a simple lookup), Learn & Investigate tasks (which require sifting through results, determining relevance, and following threads), and Create, Synthesize, and Summarize tasks (a new type of task for which generative AI is well-suited).
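As a rough sketch of how the three groups described above could be recorded in a spreadsheet or script, the example below tags tasks with a category and groups them. The three category names come from the paragraph above; the sample task descriptions and their assignments are hypothetical placeholders, not our actual typology.

```python
from collections import defaultdict
from enum import Enum


class TaskType(Enum):
    FIND = "Find"
    LEARN_INVESTIGATE = "Learn & Investigate"
    CREATE_SYNTHESIZE_SUMMARIZE = "Create, Synthesize, and Summarize"


# Hypothetical task assignments, for illustration only.
SAMPLE_TASKS = [
    ("Retrieve a statute by citation", TaskType.FIND),
    ("Identify key cases on an unfamiliar topic", TaskType.LEARN_INVESTIGATE),
    ("Draft a memo summarizing the controlling rule", TaskType.CREATE_SYNTHESIZE_SUMMARIZE),
]


def group_by_type(tasks):
    """Group (description, TaskType) pairs into a dict keyed by TaskType."""
    grouped = defaultdict(list)
    for description, task_type in tasks:
        grouped[task_type].append(description)
    return dict(grouped)


grouped = group_by_type(SAMPLE_TASKS)
for task_type, descriptions in grouped.items():
    print(f"{task_type.value}: {descriptions}")
```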

Notably, a single legal research project may contain multiple tasks. Here are a few sample projects applying a preliminary typology:

Again, we may have an initial intuition that generative AI legal research platforms, as they exist today, are not particularly helpful for some of these subtasks. For example, Lexis+ AI currently cannot retrieve (let alone analyze) all citing references to a particular case. Nor could we necessarily be certain from, say, CoCounsel’s output, that it contained all cases on point. Part of the problem is that we cannot tell which tasks the platforms are performing, or the data that they have included or excluded in generating their responses. By breaking down problems into their component tasks, and assessing competency on both the whole problem and the tasks, we hope to test our intuitions.

Future Research

We plan on continually testing these LLMs using the framework we develop to identify which tasks are suitable for AIs and which are not. Additionally, we will draft questions and provide rubrics for others to use, so that they can grade AI tools. We believe that other legal AI users will find value in this framework and rubric.
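To give a sense of what a shareable rubric might look like in practice, here is a minimal sketch. The criteria and point values are hypothetical placeholders (the real rubric is still being drafted), and the scores are assumed to be assigned by a human grader reading the tool's output.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    max_points: int


# Hypothetical criteria and weights, for illustration only.
RUBRIC = [
    Criterion("Cites controlling authority", 4),
    Criterion("Citations are real and accurate", 4),
    Criterion("Answer addresses the question asked", 2),
]


def score_response(rubric, awarded):
    """Return the fraction of available points earned.

    `awarded` maps a criterion name to the points a human grader gave;
    criteria missing from `awarded` count as zero.
    """
    total = 0
    for criterion in rubric:
        points = awarded.get(criterion.name, 0)
        if not 0 <= points <= criterion.max_points:
            raise ValueError(f"Points out of range for: {criterion.name}")
        total += points
    return total / sum(c.max_points for c in rubric)


score = score_response(
    RUBRIC,
    {
        "Cites controlling authority": 3,
        "Citations are real and accurate": 4,
        "Answer addresses the question asked": 1,
    },
)
```

Keeping the rubric as plain data means two librarians grading the same tool can compare scores criterion by criterion rather than trading impressions.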

ABA TECHSHOW 2024 Review

Posted on February 20, 2024 by Sean Harrington

Since so many of the AI Law Librarians team were able to attend this year, we thought we would combine some of our thoughts (missed you Sarah!) about this yearly legal technology conference.

Sean

Startup Alley

We arrived in Chicago on a chilly Wednesday morning, amid an Uber & Lyft strike, with plenty of time to take the train from the airport to our hotel. After an obligatory trip to Giordano’s, our students were ready to head over to the Start-up Pitch Competition. I sat with co-blogger Rebecca Fordon during the competition and we traded opinions on the merits of the start-up pitches. We both come from the academic realm and were interested in seeing the types of products that move the needle for attorneys working at firms.

I was familiar with many of the products because I spend a decent portion of my time demo’ing legal tech as part of my current role. It was stiff competition and there were many outstanding options to choose from. Once all of the pitches were done, the audience voted, and then Bob Ambrogi announced the winners. To my great surprise and pleasure, AltFee won! For the uninitiated, AltFee is “a product that helps law firms replace the billable hour with fixed-fee pricing.” This was very interesting to me because I have long thought that LLMs could mean the death knell of the billable hour in certain legal sectors. This was, at least, confirmation that the attorneys attending the TECHSHOW have this on their radar and are thinking through how they are going to solve this problem.

TECHSHOW Sessions

This year’s schedule of sessions was noticeably heavy on AI-related topics. This was great for me because I’m super interested in this technology and how it is being implemented in the day-to-day life of practitioners. I saw sessions on everything from case management software, to discovery, to marketing, kinda everything.

An especially inspiring couple of sessions for me featured Judge Scott Schlegel of the Fifth Circuit Court of Appeal in Louisiana. Judge Schlegel is the first judge that I’ve seen make fantastic use of AI in United States courts for access to justice. I am passionate about this topic and have been fishing for grants to try to implement a handful of projects that I have, so it was phenomenal to see that there are judges out there who are willing to be truly innovative. Any initiative for access to justice in the courts would require the buy-in of many stakeholders, so having someone like Judge Schlegel to point to as a proof of concept could be crucial in getting my projects off the ground. After hearing his presentations I wished that every court in the US had a version of him to advocate for these changes. Importantly, none of his projects require tons of funding or software development. They are small, incremental improvements that could greatly help regular people navigate the court system – while, in many cases, improving the daily lives of the court staff and judges who have to juggle huge caseloads. Please feel free to email grant opportunities in this vein if you see them: sharrington@ou.edu.

Side Quest: Northwestern Law AI Symposium

In the weeks leading up to the TECHSHOW I received an invite from Prof. Daniel Linna to attend Northwestern University’s AI and Law: Navigating the Legal Landscape of Artificial Intelligence Symposium. I took a frigid hike down to the school in the morning to attend a few sessions before returning to the TECHSHOW in the afternoon. It was a fantastic event with a great mix of attorneys, law professors, and computer science developers.

I was able to see Professor Harry Surden’s introductory session on how LLMs work in legal applications. While this information was not “new” to me per se (since I frequently give a similar presentation), he presented this complicated topic in an engaging, clear, and nuanced way. He’s obviously a veteran professor and expert in this area and so his presentation is much better than mine. He gave me tons of ideas on how to improve my own presentations to summarize and analogize these computer science topics to legal professionals, for which I was very grateful.

The second session was a panel that included Sabine Brunswicker, JJ Prescott, and Harry Surden. All were engaged in fascinating projects using AI in the law and I encourage you to take a look through their publications to get a better sense of what the pioneers in our field are doing to make use of these technologies in their research.

Our Students

Each year our school funds a cohort of students to attend the TECHSHOW and this year was no different. This is my first year going with them and I wasn’t sure how much value they would get out of it since they don’t have a ton of experience working in firms using these tools. Was this just a free trip to Chicago or was this pedagogically useful to them?

I will cut to the chase and say that they found this tremendously useful and loved every session that they attended. Law school can (sometimes) get a little disconnected from the day-to-day practice of law and this is a great way to bridge that gap and give the students a sense of what tools attorneys use daily to do their jobs. You’d think that all of the sexy AI-related stuff would be attractive to students but the best feedback came from sessions on basic office applications like MS Outlook and MS Word. Students are definitely hungry for this type of content if you are trying to think through workshops related to legal technology.

In addition to the sessions, the students greatly appreciated the networking opportunities. The TECHSHOW is not overly stuffy and formal and I think they really liked the fact that they could, for example, find an attorney at a big firm working in M&A and pick their brain at an afterparty to get the unfiltered truth about a specific line of work. All of the students said they would go again and I’m going to try to find ways to get even more students to attend next year. If your school ends up bringing students in the future, please reach out to me and we can have our students get together at the event.

Jenny

Jenny live-tweeted the ABA TECHSHOW’s 60 Apps in 60 Minutes and provided links. You can follow her on this exciting journey starting with this tweet:

So excited for 60 in 60! #ABATechshow
— Jennifer Wondracek (@JenWondracek) February 17, 2024

Rebecca

One of the most impactful sessions for me was titled “Revitalize Your Law Firm’s Knowledge Management with AI,” with Ben Schorr (Microsoft) and Catherine Sanders Reach (North Carolina Bar Association). To drive home why KM matters so much, they shared the statistic that knowledge workers spend a staggering 2.5 hours a day just searching for what they need. That resonated with me, as I can recall spending hours as a junior associate looking for precedent documents within my document management system. Even as a librarian, I often spend time searching for previous work that either I or a colleague has done.

To me, knowledge management is one of the most exciting potential areas to apply AI, because it’s such a difficult problem that firms have been struggling with for decades. The speakers mentioned hurdles like data silos (e.g., particular practice areas sharing only among themselves), a culture of hoarding information, and the challenges of capturing and organizing vast amounts of data, such as emails and scanned documents with poor OCR.

The speakers highlighted several AI tools that are attempting to address these issues through improved search going beyond keywords, automating document analysis to aid in categorizing documents, and suggesting related documents. They mentioned Microsoft Copilot, along with process tools like Process Street, Trainual, and Notion. Specific tools like Josef allow users to ask questions of HR documents and policies, rather than hunting for the appropriate documents.

Artificial Intelligence and the Future of Law Libraries Roundtable Events

Posted on February 12, 2024 by Sean Harrington

South Central Roundtable

OU Law volunteered to host the South Central “Artificial Intelligence and the Future of Law Libraries” roundtable and so I was fortunate enough to be allowed to attend. This is the third iteration of a national conversation on what the new AI technologies could mean for the future of law libraries and (more broadly) law librarianship. I thought I would fill you in on my experience and explain a little about the purpose and methodology of the event. The event follows the Chatham House Rule so I cannot give you specifics about what anybody said but I can give you an idea of the theme and process that we worked through.

Law Library Director Kenton Brice of OU Law elected to partner with Associate Dean for Library and Technology Greg Ivy and SMU to host the event in Dallas, TX because it was more accessible for many of the people that we wanted to attend. I’d never been to SMU and it’s a beautiful campus in an adorable part of Dallas – here’s a rad stinger I made in Premiere Pro:

Not cleared with SMU’s marketing department

TL;DR: If you get invited, I would highly recommend that you go. I found it enormously beneficial.

History and Impetus

The event is the brainchild of Cas Laskowski, Head of Research, Data & Instruction, Director of Law Library Fellows Program, and Technology & Empirical Librarian at the University of Arizona (hereinafter “Cas”). They hosted the inaugural session through U of A’s Washington, DC campus. You may have seen the Dewey B. Strategic article about it since Jean O’Grady was in attendance. The brilliant George H. Pike at Northwestern University hosted the second in the series in Chicago. I know people who have attended each of these sessions and the feedback has been resoundingly positive.

The goal of this collaborative initiative is to provide guidance to law libraries across the country as we work to strategically incorporate artificial intelligence into our operations and plan for the future of our profession.
Cas, from the U of A Website

Methodology

The event takes the entire day and it’s emotionally exhausting, in the best way possible. We were broken into tables of 6 participants. The participants were hand-selected based on their background and experience so that each table had a range of different viewpoints and perspectives.

Then the hosts (in our case, Kenton Brice and Cas Laskowski) walked us through a series of “virtuous cycle, vicious cycle” exercises. They, thankfully, started with the vicious cycle so that you could end each session on a virtuous cycle, positive note. At the end, each table chose a speaker and then we summarized the opinions discussed so that the entire room could benefit from the conversations. Apparently, this is an exercise done at places like the United Nations to triage and prepare for future events. This process went on through 3 full cycles and then we had about an hour of open discussion at the end. We got there at 8am and had breakfast and lunch on-site (both great – thank you Greg Ivy and SMU catering) because it took the entire day.

We had a great mix of academic, government, and private sector represented at the event and the diversity of stakeholders and experiences made for robust and thought-provoking conversation. Many times I would hear perspectives that had never occurred to me and would have my assumptions challenged to refine my own ideas about what the future might look like. Additionally, the presence of people with extensive expertise in specific domains, such as antitrust, copyright, the intricacies of AMLaw100 firms, and the particular hurdles faced in government roles, enriched the discussions with a depth and nuance that is rare to find. Any one of these areas can require years of experience so having a wide range of experts to answer questions allowed you to really “get into the weeds” and think things through thoroughly.

My Experience

I tend to be (perhaps overly) optimistic about the future of these technologies and so it was nice to have my optimism tempered and refined by people who have serious concerns about what the future of law libraries might look like. While the topics presented were necessarily contentious, everybody was respectful and kind in their feedback. We had plenty of time for everybody to speak (so you didn’t feel like you were struggling to get a word in).

You’d think that 8 hours of talking about these topics would be enough but we nearly ran over on every exercise. People have a lot of deep thoughts, ideas, and concerns about the state and future of our industry. Honestly, I would have been happy to have this workshop go on for several days and cover even more topics if that was possible. I learned so much and gained so much value from the people at my table that it was an incredibly efficient way to get input and share ideas.

Unlike other conferences and events that I’ve attended this one felt revolutionary – as in, we truly need to change the status quo in a big way and start getting to work on new ways to tackle these issues. “Disruptive” has become an absolute buzzword inside of Silicon Valley and academia but now we have something truly disruptive and we need to do something about it. Bringing all these intelligent people together in one room fosters an environment where disparate, fragmented ideas can crystalize into actionable plans, enabling us to support each other through these changes.

The results from all of these roundtables are going to be published in a global White Paper once the series has concluded. Each roundtable has different regions and people involved and I can’t wait to see the final product and hear what other roundtables had to say about these important issues. More importantly, I can’t wait to be involved in the future projects and initiatives that this important workshop series creates.

I echo Jean O’Grady: If you get the call, go.

New Resources for Teaching with Legal AI and Keeping Up with the Latest Research

Posted on January 30, 2024 by Sean Harrington

Today’s guest post comes from the University of Denver Sturm College of Law’s John Bliss. Professor Bliss has been kind enough to share some resources that he has crafted to help teach AI to lawyers and law students. In addition, he has launched a new blog which would likely be of interest to our audience so we are happy to host this cross-over event.

Teaching

I recently posted to SSRN a forthcoming article on Teaching Law in the Age of Generative AI, which draws from early experiments with AI-integrated law teaching, surveys of law students and faculty, and the vast new literature on teaching with generative AI across educational contexts. I outline a set of considerations to weigh when deciding how to incorporate this tech in the legal curriculum. And I suggest classroom exercises, assignments, and policies. See https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4682456.

Blog

I’m also organizing a blog that provides up-to-date analysis of research on AI’s legal capabilities. You can subscribe at http://ai-lawyering.blog. Let me know if you’re interested in contributing. The motivation for the blog is that legal AI is a fast-moving field. It is all too common that our discussions are based on outdated and inaccurate information. Empirical findings are often misconstrued in mass and social media. The blog aims to address this issue by reviewing the latest high-quality research, emphasizing empirical studies of AI capabilities as well as scholarship on the implications of this technology for lawyers and other stakeholders in the legal system.

New Program on Generative AI for Legal Practitioners

I’m working with a non-profit teaching lawyers and other members of the legal profession about generative AI: http://oak-academy.org. Just last week, we held our first session with a group of lawyers, law students, and academics. It seemed to go well!

I look forward to continuing conversations on these topics. Please feel free to reach out—jbliss@law.du.edu

Review: vLex’s Vincent AI

Posted on November 30, 2023 by Sean Harrington

Vincent is vLex’s response to implementing AI into legal research and it’s the most impressive one that I’ve seen for legal research. Damien Riehl was kind enough to give us a personalized demonstration (thanks for setting that up, Jenny!) and it was a real treat to be able to ask questions about it in real-time. I would say that the best way to see this in action is to schedule a demo for yourself but if you want to hear my hot-takes about the platform, please keep reading.

Vincent is Really Cool

Interface

Many times when you engage with these models they feel like a complete black-box. You put in some text, 🪄 presto-chango 🪄, and then it spits something back to you that seems related to what you put into it. Vincent instead offers you a fairly controlled interface that is centered around what you typically need for something like real-world legal research. It doesn’t look like a “chatbot,” sandbox-type experience and feels more like a tool that a professional would use.

You Can Tell Where It Gets the Information

This is huge because almost everything you need is on one page immediately. You ask it to draft a legal research memo and the cases are just to the right of the memo. The relevant portions of the cases have been summarized and presented there for you. A tool tells you how confident Vincent is that this is close to your request. Everything below 70% is dropped. You can toggle between cases, regs, statutes, and secondary materials available. Everything that could require a deeper dive has a hyperlink. You can get a sense of what this looks like from vLex’s website about Vincent here: https://vlex.com/vincent-ai.
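The 70% cutoff and the toggle between source types are easy to picture as a simple filter. The sketch below is my own illustration of that idea, not vLex's implementation: the `Result` fields, the function name, and the sample scores are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    source_type: str   # e.g. "case", "statute", "reg", "secondary"
    confidence: float  # 0.0 to 1.0


def filter_results(results, threshold=0.70, source_type=None):
    """Drop results below the threshold, optionally keep one source type,
    and return the rest sorted from most to least confident."""
    kept = [
        r for r in results
        if r.confidence >= threshold
        and (source_type is None or r.source_type == source_type)
    ]
    return sorted(kept, key=lambda r: r.confidence, reverse=True)


# Hypothetical results, for illustration only.
results = [
    Result("Case A", "case", 0.92),
    Result("Statute B", "statute", 0.70),
    Result("Case C", "case", 0.69),
    Result("Case D", "case", 0.85),
]

all_kept = filter_results(results)
cases_only = filter_results(results, source_type="case")
```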

Multi-Stage Prompting

vLex is probably best known for its deep archive of primary international materials. Vincent uses this to great effect (since we know that many of these NLP technologies started as translation tools). You can enter a natural language question in English, Vincent will translate it, run the search in the home country’s language, and then provide you with both the original text (so you could translate it yourself) and an English (or whatever) language translation. Sexy stuff for you FCIL researchers. Also, this is substantially more powerful than something that simply tries to grind through many iterations of similar keyword searches in other languages.

It’s also a notable example of multistage prompting and retrieval in legal research. You can see that it is being fed through not one prompt but many complex chains to produce high-quality, useful output. The tools for US caselaw are similar: Your query is turned into several different prompts that run off in different directions through the vLex database to retrieve information. Some prompts are searching through cases, statutes, regs and their secondary materials to see what is useful; others might be summarizing cases as they relate to your query; other prompts are finding counterarguments; another prompt is evaluating them for confidence on your specific subject etc. etc. and a final prompt is summarizing all of this information into a neat little report for you. In summary, they’re making great use of the technology’s potential by deploying it in many different ways. The final product is sort of a fabricated, personalized secondary source created by running tons of prompts over the underlying primary materials. In fact, Damien calls this a “Me-tise” 😂 (apologies to Damien if I stole his punchline) and he foresees it being a powerful new tool for legal researchers. I’ve been bullish on the fabrication of secondary materials since I first saw what these things could do so it was exciting to see a precursor of this in action.
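If you want a feel for what "many prompts chained together" means, here is a bare-bones sketch. To be clear, this is my own toy illustration of multistage prompting in general and not how Vincent is built: the stage names, the prompt wording, and the `fake_llm` stand-in (which just echoes the first line of each prompt so the script runs offline) are all assumptions.

```python
from typing import Callable, List

calls: List[str] = []  # records every prompt sent, so we can count the stages


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; echoes the first line of the prompt."""
    calls.append(prompt)
    return "[model output for: " + prompt.splitlines()[0] + "]"


def run_pipeline(query: str, documents: List[str], llm: Callable[[str], str]) -> str:
    # Stage 1: one prompt per source, summarizing it as it relates to the query.
    summaries = [
        llm(f"Summarize this source as it relates to the question.\nQuestion: {query}\nSource: {doc}")
        for doc in documents
    ]
    joined = "\n".join(summaries)

    # Stage 2: a separate prompt looks for counterarguments.
    counterarguments = llm(
        f"List counterarguments to the position suggested by these summaries.\nQuestion: {query}\nSummaries:\n{joined}"
    )

    # Stage 3: a final prompt assembles everything into a short report.
    return llm(
        "Write a short research memo answering the question.\n"
        f"Question: {query}\nSummaries:\n{joined}\nCounterarguments:\n{counterarguments}"
    )


report = run_pipeline(
    "Is a verbal lease enforceable?",           # hypothetical query
    ["text of case one", "text of statute two"],  # hypothetical sources
    fake_llm,
)
```

Because the model is passed in as a plain function, swapping one model for another only means passing a different callable, which is the same kind of flexibility Damien described.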

Damien let us know that behind the scenes they are using a combination of the various LLMs to achieve these results and cut costs when possible: Claude, Llama2 (Meta), and GPT4. We met with him shortly after the OpenAI controversy and he pointed out that they are able to swap models in vLex if necessary.

Secondary Materials and Market Share

We have all come to love and rely on specific secondary materials that exist in Westlaw and Lexis. vLex’s acquisition of Fastcase meant that they acquired a huge, fantastic database of primary US materials. The one pain point for people who may be interested in switching from Westlaw/Lexis to Fastcase was the relative dearth of secondary materials available. The features that I saw last week in vLex may fill that need for some users and it will be interesting to see if people are lured away from their favorite practice guide or treatise published by Lexis or Thomson Reuters because a robot can now do some of that work summarizing and analyzing vast quantities of primary law. It will also be interesting to see if Lexis and Westlaw will roll out these types of features, since they could be in direct competition with their robust (and pricey) secondary materials offerings.

Before I get a slew of angry emails: I recognize that a traditional secondary material does much more than summarize cases, statutes, and regulations but it does some of that (also remember we’re still in the infancy of this technology for legal research). If that is all the researcher needs, then these tools could work as a replacement for some people (and they don’t rely on monthly updates – they do this on demand). That may allow some people to cut ties with Lexis and Westlaw in a way that could shake up the industry and disrupt the status quo. It could also be incredibly powerful for something like a 50-state survey or even surveys across many different countries. Feel free to let me know what an ignoramus I am in the comments if I am missing something here.

Outstanding Questions

Price

I’ll dive right in where you all have questions, “Can we afford this thing?” Dunno and it depends (super satisfying, I know). The difficulty here is that these things are still very expensive to operate. The more sophisticated the model, the larger the database, the more complex the stages of prompting, the various modalities (scanning documents, reading the screen, etc.) – the more it costs them. They are all trying to figure out how to create a pricing structure where they can 1) offer it to the widest audience possible and 2) remain profitable. As we know, their primary source of revenue is the big firms and so the product is currently only available in paid beta for select companies.

Damien and vLex are both refreshingly upfront and clear about this. No hand-waving or sales talk, which I think is why so many people in our industry look to people like Damien for information about these technologies as they are developed. Damien mentioned that they are taking the “democratize the law” call to action from Fastcase seriously and are looking for ways to make it affordable on the academic market.

Possible Future Options

This is all complete speculation on my part but some sort of limited version of the platform seems like it could be reasonable for the academic market (like BLaw does with their dockets): limited uses per day, limited uses per account, a “lesser” account with limited features, etc. As the market stands today academic law libraries have access to a limited version of Lexis+ AI, trial access to Casetext CoCounsel (unless you’re willing to pay), no access to Westlaw Copilot, no access to Harvey AI, and no access to vLex. I anticipate all of that will change as the prices come down. The point of frustration is obviously that we want to be able to evaluate these tools so that we can teach them to students, in addition to using them ourselves so that we can benefit from the technology.

In conclusion, Vincent by vLex represents a significant step forward in AI-driven legal research. Its sophisticated multi-stage prompting, transparent sourcing, and potential in fabricating secondary materials make it a formidable tool. The future of Vincent and similar AI platforms in the academic and broader legal research community is certainly something to watch closely.

Demystifying LLMs: Crafting Multiple Choice Questions from Law Outlines

Posted on November 13, 2023 by Sean Harrington

In today’s post, we’ll explore how legal educators and law students can use Large Language Models (LLMs) like ChatGPT and Claude to create multiple-choice questions (MCQs) from a law school outline.

Understanding the Process

My first attempt at this was to simply ask the LLM the best way to make MCQs but it didn’t end up being particularly helpful feedback, so I did some digging. Anthropic recently shed light on their method of generating multiple-choice questions, and it’s a technique that could be immensely beneficial for test preparation – besides being a useful way to conceptualize how to make effective use of the models for studying. They utilize XML tags, which may sound technical, but in essence, these are just simple markers used to structure content. Let’s break down this process into something you can understand and use, even if you’re not a wizard at Technical Services who is comfortable with XML.

Imagine you have a law school outline on federal housing regulations. You want to test your understanding or help students review for exams. Here’s how an LLM can assist you:

STEP 1: Prepare Your Outline

Ensure that your outline is detailed and organized. It should contain clear sections, headings, and bullet points that delineate topics and subtopics. This structure will help the LLM understand and navigate your content. If you’re comfortable using XML or Markdown, this can be exceptionally helpful. Internally, the model identifies the XML tags and the text they contain, using this structure to generate new content. It recognizes the XML tags as markers that indicate the start and end of different types of information, helping it to distinguish between questions and answers.

The model uses the structure provided by the XML tags to understand the format of the data you’re presenting.
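
If you would rather apply tags with a short script than by hand, the idea can be sketched in Python. This is a minimal sketch, not a required format: the tag names (LawSchoolOutline, Section, and so on) and the outline_to_xml helper are illustrative assumptions, and any consistent set of tags should work.

```python
import xml.etree.ElementTree as ET

def outline_to_xml(course_title, sections):
    """Wrap a plain outline in XML tags.

    `sections` maps a section title to a list of paragraph strings.
    The tag names are illustrative; no model requires these exact names.
    """
    root = ET.Element("LawSchoolOutline")
    ET.SubElement(root, "CourseTitle").text = course_title
    for title, paragraphs in sections.items():
        section = ET.SubElement(root, "Section")
        ET.SubElement(section, "SectionTitle").text = title
        content = ET.SubElement(section, "Content")
        for text in paragraphs:
            ET.SubElement(content, "Paragraph").text = text
    ET.indent(root)  # pretty-print the tree; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")

tagged_outline = outline_to_xml(
    "Federal Housing Regulations",
    {"Compliance Costs": ["Costs associated with adhering to HUD regulations."]},
)
print(tagged_outline)
```

The printed text can be pasted into the chat window or saved as a file and uploaded.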

STEP 2: Uploading the Outline

Upload your outline into the platform that you’re using. Most platforms that host LLMs will allow you to upload a document directly, or you may need to copy and paste the text into a designated area.

STEP 3: Crafting a General Prompt

You can write a general prompt that instructs the LLM to read through your outline and identify key points to generate questions. For example:

“Please read the uploaded outline on federal housing regulations and create multiple-choice questions with four answer options each. Focus on the main topics and legal principles outlined in the document.”

STEP 4: Utilizing Advanced Features

Some LLMs have advanced features that can take structured or semi-structured data and understand the formatting. These models can sometimes infer the structure of a document without explicit XML or Markdown tags. For instance, you might say:

“Using the headings and subheadings as topics, generate multiple-choice questions that test the key legal concepts found under each section.”

AND/OR

Give the model some examples with XML tags so it can better replicate what you would like (this is known as “few-shot prompting”):

<Question>
What are "compliance costs" in HUD regulations?
</Question>
<Answers>
1. Fines for non-compliance.
2. Costs associated with adhering to HUD regulations.
3. Expenses incurred during HUD inspections.
4. Overheads for HUD compliance training.
</Answers>

The more examples you give, the better it’s going to be.
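
If you keep your examples in a list, assembling the few-shot prompt can be automated. The following is a rough sketch under a few assumptions: the helper names and the <Outline> wrapper tag are made up for illustration, and the instruction wording is only a starting point.

```python
def format_example(question, answers):
    """Render one worked example in the <Question>/<Answers> tag format."""
    numbered = "\n".join(f"{i}. {a}" for i, a in enumerate(answers, start=1))
    return f"<Question>\n{question}\n</Question>\n<Answers>\n{numbered}\n</Answers>"

def build_few_shot_prompt(instruction, examples, outline_text):
    """Combine an instruction, tagged examples, and the outline into one prompt."""
    shots = "\n\n".join(format_example(q, a) for q, a in examples)
    return (
        f"{instruction}\n\n"
        f"Here are examples of the format I want:\n\n{shots}\n\n"
        # The <Outline> tag is an illustrative choice, not a required name.
        f"<Outline>\n{outline_text}\n</Outline>"
    )

examples = [
    (
        'What are "compliance costs" in HUD regulations?',
        [
            "Fines for non-compliance.",
            "Costs associated with adhering to HUD regulations.",
            "Expenses incurred during HUD inspections.",
            "Overheads for HUD compliance training.",
        ],
    ),
]

prompt = build_few_shot_prompt(
    "Create multiple-choice questions with four answer options each.",
    examples,
    "(paste your outline text here)",
)
print(prompt)
```

Adding more (question, answers) pairs to the examples list adds more shots to the prompt without changing anything else.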

AND/OR

You can also use the LLM to add these XML tags depending on the size of your outline and the context limit of the model you are using (OpenAI recently expanded their limit dramatically). Give it a prompt asking it to apply tags and give it an example of the types of tags you would like for your content. Then tell the model to do it with the rest of your outline:

<LawSchoolOutline>
    <CourseTitle>Constitutional Law</CourseTitle>
    <Section>
        <SectionTitle>Executive Power</SectionTitle>
        <Content>
            <SubSection>
                <SubSectionTitle>Definition and Scope</SubSectionTitle>
                <Paragraph>
                    Executive power is vested in the President of the United States and is defined as the authority to enforce laws and ensure they are implemented as intended by Congress.
                </Paragraph>
            </SubSection>
        </Content>
    </Section>
</LawSchoolOutline>
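
When a model applies the tags for you, a long outline can bump against the context limit and the output may stop before every tag is closed. A quick check like the one below can catch that before you reuse the tagged outline. It is a sketch that assumes the output is meant to be well-formed XML; the is_well_formed helper and the two sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

complete = "<Section><SectionTitle>Executive Power</SectionTitle></Section>"
cut_off = "<Section><SectionTitle>Executive Power</SectionTitle>"  # missing </Section>

print(is_well_formed(complete))  # True
print(is_well_formed(cut_off))   # False
```

If the check fails, asking the model to continue from where it stopped, or tagging the outline in smaller pieces, is usually the next thing to try.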

STEP 5: Refining the Prompt

It is very rare that my first try with any of these tools produces fantastic output. It is often a “conversation with a robot sidekick” (as many of you have heard me say at my presentations) and requires you to nudge the model to create better and better output.

If the initial questions need refinement, you can provide the LLM with more specific instructions. For example:

“For each legal case mentioned in the outline, create a question that covers the main issue and possible outcomes, along with incorrect alternatives that are plausible but not correct according to the case facts.”

Replicating the Process

Students can replicate this process for other classes using the same prompt. The trick here is to stay as consistent as possible with the way that you structure and tag your outlines. It might feel like a lot of work on the front end to create 5+ examples, apply tags, etc. but remember that this is something that can be reused later! If you get a really good MCQ prompt, you could use it for every class outline that you have and continue to refine it going forward.
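
Consistent tags pay off on the output side too: if the model always answers in the same <Question>/<Answers> format, the results can be pulled apart the same way for every class. Here is a minimal sketch, assuming the model's reply follows the tag format from the example above; the parse_mcqs helper and the sample reply are illustrative.

```python
import re

# Matches one <Question>...</Question> block followed by its <Answers> block.
QUESTION_RE = re.compile(
    r"<Question>\s*(.*?)\s*</Question>\s*<Answers>\s*(.*?)\s*</Answers>",
    re.DOTALL,
)

def parse_mcqs(model_output):
    """Extract (question, [answers]) pairs from tagged model output."""
    mcqs = []
    for question, answer_block in QUESTION_RE.findall(model_output):
        answers = [
            re.sub(r"^\d+\.\s*", "", line.strip())  # drop the leading "1. "
            for line in answer_block.splitlines()
            if line.strip()
        ]
        mcqs.append((question, answers))
    return mcqs

sample_output = """<Question>
What are "compliance costs" in HUD regulations?
</Question>
<Answers>
1. Fines for non-compliance.
2. Costs associated with adhering to HUD regulations.
3. Expenses incurred during HUD inspections.
4. Overheads for HUD compliance training.
</Answers>"""

for question, answers in parse_mcqs(sample_output):
    print(question)
    for answer in answers:
        print("  -", answer)
```

From there the questions could be shuffled, exported to flashcards, or loaded into a quiz tool.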

Big Brother

Posted on November 9, 2023 by Sean Harrington

Guest post from Andrew Dang, ASU Law Student and LLM Developer.

This week, OpenAI announced new features for their platform at their first keynote event, including a new GPT-4 Turbo with 128K context, GPT-4 Turbo with Vision, the DALL·E 3 API, and more. They also announced their Assistants API for building agents, including their own retrieval-augmented generation (RAG) pipeline. Today, we will focus on OpenAI’s entry into the RAG market.

At the surface level, RAG boils down to text generation models like ChatGPT retrieving data, such as documents, to assist users with question answering, summarization, and so on. Behind the scenes, however, other factors are at play, such as vector databases, document chunking, and embedding models. Most RAG pipelines rely on an external vector database and require compute to create the embeddings. However, what OpenAI’s retrieval tool brings to the table is an all-encompassing RAG system. The system eliminates the need for external databases and for the compute required to create and store the embeddings. Whether OpenAI’s retrieval system is optimal is a story for another day. Today we are focusing on the data implications.
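
To make the moving parts concrete, here is a toy sketch of the retrieval half of a RAG pipeline in plain Python. It is not how OpenAI's tool or any production system works: the "embedding" is just a word-count vector standing in for a real embedding model, the "vector database" is a Python list, and the two sample documents are made up for illustration.

```python
import math
import re
from collections import Counter

def chunk_text(text, max_words=40):
    """Split a document into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class InMemoryVectorStore:
    """Stand-in for a vector database: keeps (vector, chunk) pairs in a list."""

    def __init__(self):
        self.entries = []

    def add_document(self, text):
        for chunk in chunk_text(text):
            self.entries.append((embed(chunk), chunk))

    def search(self, query, k=1):
        """Return the k stored chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vec, entry[0]),
            reverse=True,
        )
        return [chunk for _, chunk in ranked[:k]]

store = InMemoryVectorStore()
store.add_document("Bail may be reduced through a pre-trial habeas corpus application.")
store.add_document("Landlords must comply with federal housing regulations issued by HUD.")

question = "Which regulations apply to landlords?"
context = store.search(question, k=1)
# The retrieved chunks are placed in the prompt sent to the text generation model.
prompt = "Answer using only this context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

In a real pipeline each of these pieces is swapped for something heavier (a learned embedding model, a dedicated vector database), which is where the external services and compute costs described above come in.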

Data is the new currency fueling the new economy. Big Tech aims to take control of the economy by ingesting organizations’ private data, including IP, leading to a “monolithic system” that completely controls users’ data. Google, Microsoft, Adobe, and OpenAI are now offering indemnification to their users against potential copyright infringement lawsuits related to Generative AI, aiming to protect their business model by ensuring more favorable legal precedents. This strategy is underscored by the argument that both the input (ideas, which are uncopyrightable) and the output (machine-generated expressions, deemed uncopyrightable by the US Copyright Office) of Generative AI processes do not constitute copyright infringement. The consequences of Big Tech having their way could be dire, leading us to a cyberpunk dystopia that none of us want to live in. Technology and its algorithms would be in charge, and our personal data could be used to manipulate us. Our data reveals our interests, private health information, location status, etc. When algorithms feed us only limited, targeted information based on our existing interests and views, it restricts the outside influence and diversity of opinion that are crucial to freedom of thought. Organizations must not contribute to this cyberpunk dystopia where Big Tech becomes Big Brother. Furthermore, companies are putting their employees, clients, and stakeholders at risk when handing data to Big Tech. Big Tech favors the role of tortfeasor rather than the role of the good Samaritan who complies with consumer privacy laws.

To prevent Big Brother, organizations should implement their own RAG pipeline. Open-source tools such as LlamaIndex, Qdrant, and LangChain can be used to create powerful RAG pipelines with your privacy and interests protected. LLMWare also released an open-source RAG pipeline and domain-specific embedding models. Generative AI is a powerful tool and can enhance our lives, but at the same time, in the wrong hands, the cyberpunk nightmare can become a reality. The ease of using prebuilt, turn-key systems, such as those offered by OpenAI, is appealing. However, the long-term risks associated with entrusting our valuable data to corporations, without a regulatory framework or protections, raise concerns about a potentially perilous direction.

Beware the Legal Bot: Spooky Stories of AI in the Courtroom

Posted on October 31, 2023 by Sean Harrington

The “ChatGPT Attorney” case has drawn much attention, but it’s not the only example of lawyers facing problems with AI use. This blog will compile other instances where attorneys have gotten into trouble for incorporating AI into their practice. Updates will be made as new cases or suggestions arise, providing a centralized resource for both legal educators and practicing attorneys (or it can be used to update a LibGuide 😉). I will also add this to one of our menus or headings for easy access.

https://www.damiencharlotin.com/hallucinations/

Damien Charlotin has put together a fantastic database of attorneys getting caught using AI improperly. Instead of reinventing the wheel, I will simply link to his fantastic resource.

Attorney Discipline

Park v. Kim, No. 22-2057, 2024 WL 332478 (2d Cir. Jan. 30, 2024)

“Attorney Jae S. Lee. Lee’s reply brief in this case includes a citation to a non-existent case, which she admits she generated using the artificial intelligence tool ChatGPT. Because citation in a brief to a non-existent case suggests conduct that falls below the basic obligations of counsel, we refer Attorney Lee to the Court’s Grievance Panel, and further direct Attorney Lee to furnish a copy of this decision to her client, Plaintiff-Appellant Park.”

Mata v. Avianca, Inc. (1:22-cv-01461) District Court, S.D. New York

I will not belabor the ChatGPT attorney case (since it has been covered by real journalists like the NYT) and will only provide links to the underlying dockets, since I get asked for them fairly often:

https://www.bloomberglaw.com/document/2048130161736962
(To request individual filings – BLaw subscription is required)
https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
Public version with limited access

(Fireworks start at the May 4, 2023 OSC)

Zachariah Crabill, Colorado Springs

In a less publicized case from Colorado, an attorney, Zachariah Crabill, relied on ChatGPT to draft a legal motion, only to find out later that the cited cases were fictitious. Unfortunately, the court filings are not accessible through El Paso County’s records or Bloomberg Law. If any Colorado law librarians can obtain these documents, please contact me, and I’ll update this post accordingly.

News articles:

Zachariah was subsequently sanctioned and suspended:

https://coloradosupremecourt.com/PDJ/Decisions/Crabill,%20Stipulation%20to%20Discipline,%2023PDJ067,%2011-22-23.pdf

Ex Parte Allen Michael Lee, No. 10-22-00281-CR, 2023 WL 4624777 (Tex. Crim. App. July 19, 2023)

An opinion by Chief Justice Tom Gray explains that Allen Michael Lee faces charges related to child sexual assault, with bail set at $400,000, which he hasn’t been able to post. Lee sought a bail reduction through a pre-trial habeas corpus application, but the court denied this, leading Lee to argue that the denial was an abuse of discretion due to excessive initial bail. However, his appeal was critiqued for inadequate citation, as the cases he referenced either didn’t exist or were unrelated to his arguments.

Updates:

David Wagner, This Prolific LA Eviction Law Firm Was Caught Faking Cases In Court. Did They Misuse AI?, LAist (Oct 12, 2023)
Submitted by my co-author Rebecca Fordon

“Cuddy Law Firm in New York has been submitting exhibits of transcripts of interactions with ChatGPT to their motions for attorneys fees (essentially a back and forth to zero in on what is a reasonable rate) in several cases in S.D. NY.”
[This is an ongoing action and we’re waiting to see if it is allowed]
from reader Jason as a comment (very much appreciated, Jason)

A Spooky Glimpse into the Future

In 2019, Canadian Judge Whitten reduced an attorney’s requested fees on the grounds that the attorney had not utilized AI technology:

The decision concerned a request for attorneys’ fees and expenses by defendant, Port Dalhousie Vitalization Corporation (PDVC). The court granted summary judgment in PDVC’s favor against a woman who sued PDVC after she slipped and fell at an Ontario bar for which PDVC was the landlord. The bar, My Cottage BBQ and Brew, defaulted in the case. In his ruling, Justice Whitten mentioned that the use of AI in legal research would have reduced the amount of time one of the attorneys for the defendant would have spent preparing his client’s case.
https://www.lexisnexis.com/community/insights/legal/b/thought-leadership/posts/judge-slams-attorney-for-not-using-ai-in-court

In domains where AI can significantly expedite workflows, it could indeed become standard practice for judges to scrutinize fee requests more rigorously. Attorneys might be expected to leverage the latest technological tools to carry out tasks more efficiently, thereby justifying their fees. In this scenario, sticking to traditional, manual methods could be perceived as inefficient, and therefore, not cost-effective, leading to fee reductions. This has led many people to wonder if AI will expedite the decline of the billable hour (for more on that please see this fantastic discussion on 3 Geeks and a Law Blog, AI-Pocalypse: The Shocking Impact on Law Firm Profitability).

We hope that you have a Happy Halloween!