Leapfrogging the Competition: Claude 3 Researches and Writes Memos (Better Than Some Law Students and Maybe Even Some Lawyers?)

Introduction

I’ve been incredibly excited about the premium version of Claude 3 since its release on March 4, 2024, and for good reason. After my previous favorite chatty chatbot, ChatGPT-4, went off the rails, I was missing a competent chatbot… I signed up the second I heard the news, and it has been a pleasure to use Claude 3 ever since. It actually understands my prompts and usually provides me with impressive answers. Anthropic, maker of the Claude chatty chatbot family, has been touting Claude’s accomplishments in supposedly beating its competitors on common chatbot benchmarks, and commentators on the Internet have been singing its praises. Just last week, I was so impressed by its ability to analyze information in news stories in uploaded files that I wrote a LinkedIn post singing its praises as well!

Hesitation After Previous Struggles

Despite my high hopes for Claude 3’s legal research abilities after experimenting with it last week, I was hesitant to put it to the test. I have a rule about intentionally irritating myself—if I’m not already irritated, I don’t go looking for irritation… Over the past several weeks, I’ve wasted countless hours trying to improve the legal research capabilities of ChatGPT-3.5, ChatGPT-4, Microsoft Copilot, and my legal research/memo writing GPTs through the magic of (IMHO) clever prompting and repetition. Sadly, I failed miserably and concluded that either ChatGPT-4 was suffering from some form of robotic dementia, or I am. The process was a frustrating waste of time, and I knew that Claude 3 doing a bad job of legal research too could send me over the edge…

Claude 3 Wrote a Pretty Good Legal Memorandum!

Luckily for me, when I finally got up the nerve to test Claude 3’s abilities, I found that the internet hype was not overstated. Somehow, Claude 3 has suddenly leapfrogged its competitors in legal research, legal analysis, and legal memo writing ability – it instantly did what would have taken a skilled researcher over an hour and produced a legal memorandum that is probably better than what many law students and even some lawyers would produce. Check it out for yourself! There doesn’t seem to be a way to link directly to a Claude 3 chat at this time (unless this link happens to work for Claude 3 subscribers out there). However, click here for the whole chat cut and pasted into a Google Drive document, here for a very long screenshot image of the chat, or here for the final 1,446-word version of the memo as a Word document.

Comparing Claude 3 with Other Systems

Back to my story… The students’ research assignment for the last class was to think of some prompts and compare the results from ChatGPT-3.5, Lexis+ AI, Microsoft Copilot, and a system of their choice. Claude 3 did not exist at the time, but I told them not to try the free Claude product because I had canceled my $20.00 subscription to the Claude 2 product in January 2024 due to its inability to provide useful answers – it would insist that answering my questions was unethical and tell me to do the work myself. When creating an answer sheet before class tomorrow comparing the same set of prompts on different systems, I decided to omit Lexis+ AI (because I find it useless) and to include my new fav Claude 3 in my comparison spreadsheet. Check it out to compare for yourself!

For the research part of the assignment, all systems were given a fact pattern and asked to “Please analyze this issue and then list and summarize the relevant Texas statutes and cases on the issue.” While the other systems either made up cases or produced just two or three real, correctly cited cases on the research topic, Claude 3 stood out by generating 7 real, relevant cases with correct citations in response to the legal research question. (And it cited to 12 cases in the final version of its memo.)

It did a really good job of analysis too!

Generating a Legal Memorandum

Writing a memo was not part of the class assignment because the ChatGPT family had been refusing to write memos for the past few weeks,* and Bing Copilot had to be tricked into writing one as part of a short story. But after seeing Claude 3’s research and analysis results, I decided to just see what happened. I have many elaborate prompts for ChatGPT-4 and my legal memorandum GPTs, but I recalled reading that Claude 3 worked well with zero-shot prompting and didn’t require much explanation to produce good results. So, I decided to keep my prompt simple – “Please generate a draft of a 1500 word memorandum of law about whether Snurpa is likely to prevail in a suit for false imprisonment against Mallatexaspurses. Please put your citations in Bluebook citation format.”
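
For anyone curious how this looks outside the chat window, the same zero-shot prompt can be sent through Anthropic’s Python SDK. The sketch below is only an illustration of that approach – the model name, token limit, and fact-pattern placeholder are my assumptions, since my actual experiment was done entirely in the Claude chat interface.

    # Minimal sketch of the same zero-shot memo prompt via Anthropic's Python SDK.
    # Assumptions: the claude-3-opus model name, a 4096-token limit, and a placeholder
    # fact pattern; the original experiment was run in the Claude web interface.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    fact_pattern = "..."  # the Snurpa / Mallatexaspurses fact pattern would go here

    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                fact_pattern
                + "\n\nPlease generate a draft of a 1500 word memorandum of law about "
                "whether Snurpa is likely to prevail in a suit for false imprisonment "
                "against Mallatexaspurses. Please put your citations in Bluebook "
                "citation format."
            ),
        }],
    )

    print(response.content[0].text)  # the draft memo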

From my experience last week with Claude 3 (and prior experience with Claude 2, back when it would actually answer questions), I knew the system wouldn’t give me as long an answer as requested. The first attempt yielded a pretty high-quality 735-word draft memo that cited all real cases with correct citations*** and applied the law to the facts in a well-organized Discussion section. I asked it to expand the memo two more times, and it finally produced a 1,446-word document. Here is part of the Discussion section…

Implications for My Teaching

I’m thrilled about this great leap forward in legal research and writing, and I’m excited to share this information with my legal research students tomorrow in our last meeting of the semester. This is particularly important because I did such a poor job illustrating how these systems could be helpful for legal research when all the compared systems were producing inadequate results.

However, with my administrative law legal research class starting tomorrow, I’m not sure how this will affect my teaching going forward. I had my video presentation ready for tomorrow, but now I have to change it! Moreover, if Claude 3 can suddenly do such a good job analyzing a fact pattern, performing legal research, and applying the law to the facts, how does this affect what I am going to teach them this semester?

*Weirdly, the ChatGPT family, perhaps spurred on by competition from Claude 3, agreed to attempt to generate memos today, which it hasn’t done in weeks…

Note: Claude 2 could at one time produce an okay draft of a legal memo if you uploaded the cases for it, but that was months ago (Claude 2 link if it works for premium subscribers and Google Drive link of the cut-and-pasted chat). Requests in January resulted in lectures about ethics, which led to the above-mentioned cancellation.

Does ChatGPT-4 Have Dementia?

Is it just me, or has ChatGPT-4 taken a nosedive when it comes to legal research and writing? There has been a noticeable decline in its ability to locate primary authority on a topic, analyze a fact pattern, and apply law to facts to answer legal questions. Recently, instructions slide through its digital grasp like water through a sieve, and its memory? I would compare it to a goldfish, but I don’t want to insult them. And before you conclude it’s just me, it isn’t: the internet agrees!

ChatGPT’s Sad Decline

One of the hottest topics in the OpenAI community, in the aptly named “GPT-4 is getting worse and worse every single update” thread, is the perceived decline in the quality and performance of the GPT-4 model, especially after the November 2023 update. Many users have reported that the model is deteriorating with each update, producing nonsensical, irrelevant, or incomplete outputs, forgetting the context, and ignoring instructions. Some users have even reverted to previous versions of the model or cancelled their subscriptions. Here are some specific quotations from recent comments about the memory problem:

  • December 2023 – “I don’t know what on Earth is wrong with GPT 4 lately. It feels like I’m talking to early 3.5! It’s incapable of following basic instructions and forgets the format it’s working on after just a few posts.”
  • December 2023 – “It ignores my instructions, in the same message. I can’t be more specific with what I need. I’m needing to repeat how I’d like it to respond every single message because it forgets, and ignores.”
  • December 2023 – “ChatGPT-4 seems to have trouble following instructions and prompts consistently. It often goes off-topic or fails to understand the context of the conversation, making it challenging to get the desired responses.”
  • January 2024 – “…its memory is bad, it tells you search the net, bing search still sucks, why would teams use this product over a ChatGPT Pre Nov 2023.”
  • February 2024 – “It has been AWFUL this year…by the time you get it to do what you want format wise it literally forgets all the important context LOL — I hope they fix this ASAP…”
  • February 2024 – “Chatgpt was awesome last year, but now it’s absolutely dumb, it forgets your conversation after three messages.”

OpenAI has acknowledged the issue and released an updated GPT-4 Turbo preview model, which is supposed to reduce the cases of “laziness” and complete tasks more thoroughly. However, the feedback from users is still mixed, and some are skeptical about the effectiveness of the fix.

An Example of Confusion and Forgetfulness from Yesterday

Here is one of many experiences that illustrates the short-term memory and instruction-following issues that other ChatGPT-4 users have reported. Yesterday, I asked it to find some Texas cases about the shopkeeper’s defense to false imprisonment. Initially, ChatGPT-4 retrieved and summarized some relatively decent cases. Well, to be honest, it retrieved 2 relevant cases, with one of the two dating back to 1947… But anyway, the decline in case law research ability is a subject for another blog post.

Anyway, in an attempt to get ChatGPT-4 to find the cases on the internet so it could properly summarize them, I provided some instructions and specified the format I wanted for my answers. Click here for the transcript (only available to ChatGPT-4 subscribers).

Confusion ran amok! ChatGPT-4 apparently understood the instructions (which was a positive sign) and presented three cases in the correct format. However, they weren’t the three cases ChatGPT had listed; instead, they were entirely irrelevant to the topic—just random criminal cases.

It remembered… and then forgot. When reminded that I wanted it to work with the first case listed and provided the citation, it apologized for the confusion. It then proceeded to give the correct citation, URL, and a detailed summary, but unfortunately in the wrong format!

Eventually, in a subsequent chat, I successfully got it to take a case it found, locate the text of the case on the internet, and then provide the information in a specified format. However, it could only do it once before completely forgetting about the specified format. I had to keep cutting and pasting the instructions for each subsequent case.

Sigh… I definitely echo the sentiments expressed in the “GPT-4 is getting worse and worse every single update” thread.

ChatGPT Is Growing a Long-Term Memory

Well, the news is not all bad! While we are on the topic of memory, OpenAI has introduced a new feature for ChatGPT – the ability to remember stuff over time. ChatGPT’s memory feature is being rolled out to a small portion of free and Plus users, with broader availability planned soon. According to OpenAI, this enhancement allows ChatGPT to remember information from past interactions, resulting in more personalized and coherent conversations. During conversations, ChatGPT automatically picks up on details it deems relevant to remember. Users can also explicitly instruct ChatGPT to remember specific information, such as meeting note preferences or personal details. Over time, ChatGPT’s memory improves as users engage with it more frequently. This memory feature could be useful for users who want consistent responses, such as replying to emails in a specific format.

The memory feature can be turned off entirely if desired, giving users control over their experience. Deleting a chat doesn’t erase ChatGPT’s memories; users must delete specific memories individually…which seems a bit strange. For conversations without memory, users can use temporary chat, which won’t appear in history, won’t use memory, and won’t train the AI model.

The Future?

As we await improvements to our once-loved ChatGPT-4, our options remain limited, pushing us to consider alternative avenues. Sadly, I’ve recently encountered similar shortcomings with Claude 2, which was once useful for legal research and writing. In my pursuit of alternatives, platforms like Gemini, Perplexity, and Hugging Face have proven less than ideal for research and writing tasks. However, amidst these challenges, Microsoft Copilot has shown promise. While not without its flaws, it recently demonstrated adequate performance in legal research and even took a passable stab at a draft of a memo. Given OpenAI’s recent advancements in the form of Sora, the near-magical text-to-video generator that is causing such hysteria in Hollywood, there’s reason to hope that they can pull ChatGPT back from the brink.

Birth of the Summarizer Pro GPT: Please Work for Me, GPT

Last week, my plan was to publish a blog post about creating a GPT goofily self-named Summarizer Pro to summarize articles and organize citation information in a specific format for inclusion in a LibGuide. However, upon revisiting the task this week, I find myself first compelled to discuss the recent and thrilling advancements surrounding GPTs – the ability to incorporate GPTs into a ChatGPT conversation.

What is a GPT?

But, first of all, what is a GPT? The OpenAI website explains that GPTs are specialized versions of ChatGPT designed for customized applications. These unique GPTs enable anyone to modify ChatGPT for enhanced utility in everyday activities, specific tasks, professional environments, or personal use, with the added ability to share these personalized versions with others.

To create or use a GPT, you need access to ChatGPT’s advanced features, which require a paid subscription. Building your own customized GPT does not require programming skills. The process involves starting a chat, giving instructions and additional information, choosing capabilities like web searching, image generation, or data analysis, and iteratively testing and improving the GPT. Popular examples that ChatGPT users have created and shared in the GPT Store include Consensus, Write For Me, and Image Generator.

GPT Mentions

This was already exciting, but last week OpenAI introduced a feature that takes it to the next level – users can now invoke a specialized GPT within a ChatGPT conversation. This is being referred to online as “GPT mentions.” By typing the “@” symbol, you can choose from GPTs you’ve used previously for specific tasks. Unfortunately, this feature hasn’t rolled out to me yet, so I haven’t had the chance to experiment with it, but it seems incredibly useful. You can chat with ChatGPT as normal while also leveraging customized GPTs tailored to particular needs. For example, with the popular bots listed above, you could ask ChatGPT to summon Consensus to compile articles on a topic, then call on Write For Me to draft a blog post based on those articles, and finally invoke Image Generator to create a visual for the post. This takes the versatility of ChatGPT to the next level by integrating specialized GPTs on the fly.

Back to My GPT Summarizer Pro

Returning to my original subject: employing a GPT to summarize articles for my LibGuide titled ChatGPT and Bing Chat Generative AI Legal Research Guide. This guide features links to articles along with summaries on various topics related to generative AI and legal practice. Traditionally, I have used ChatGPT (or occasionally Bing or Claude 2, depending on how I feel) to summarize these articles for me. It usually performs admirably on the summary part, but I’m left to manually insert the title, publication, author, date, and URL according to a specific layout. I’ve previously asked regular old ChatGPT to organize the information in this format, but the results have been inconsistent. So, I decided to create my own GPT tailored for this task, despite having encountered mixed outcomes with my previous GPT efforts.

Creating GPTs is generally a simple process, though it often involves a bit of fine-tuning to get everything working just right. The process kicks off with a set of questions… I outlined my goals for the GPT – I needed the answers in a specific format, including the title, URL, publication name, author’s name, date, and a 150-word summary, all separated by commas. Typically, crafting a GPT involves some back-and-forth with the system. This was exactly my experience. However, even after this iterative process, the GPT wasn’t performing exactly as I had hoped. So, I decided to take matters into my own hands and tweak the instructions myself. That made all the difference, and suddenly, it began (usually) producing the information in the exact format I was looking for.
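
For anyone who would rather script this task than use the GPT builder, roughly the same behavior can be approximated with a system prompt sent through OpenAI’s standard chat completions API. The sketch below is my own illustration, not Summarizer Pro’s actual configuration – the model name and the exact instruction wording are assumptions based on the format described above.

    # Rough, hypothetical approximation of the Summarizer Pro task via OpenAI's Python SDK.
    # The model name and instruction wording are assumptions, not the GPT's real configuration.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    SYSTEM_PROMPT = (
        "You summarize articles for a legal research LibGuide. For each article, "
        "return the title, URL, publication name, author's name, date, and a "
        "150-word summary, all separated by commas."
    )

    article_text = "..."  # paste the text of the article to be summarized here

    reply = client.chat.completions.create(
        model="gpt-4-turbo-preview",  # assumed model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": article_text},
        ],
    )

    print(reply.choices[0].message.content)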

Summarizer Pro in Action!

Here is an example of Summarizer Pro in action! I pasted a link to an article into the text box, and it produced the information in the desired format. However, reflecting the dynamic nature of ChatGPT responses, the summaries generated this time were shorter than last week’s. Attempts to coax it into generating a longer or more detailed summary were futile… Oh well, perhaps they’ll be longer if I try again tomorrow or next week.

Although it might not be the fanciest or most thrilling use of a GPT, it’s undeniably practical and saves me time on a task I periodically undertake at work. Of course, there’s no shortage of less productive, albeit entertaining, GPT applications, like my Ask Sarah About Legal Information project. For this, I transformed around 30 of my blog posts into a GPT that responds to questions in the approximate manner of Sarah.

ChatGPT-4 Can See Us Now! And Our Desiccated Potatoes…

Within the rapidly advancing realm of generative AI, ChatGPT is expanding the inventory of human senses available to its GPT-4 subscribers. For a mere $20 a month, you can experience its new ability to see, speak, and hear! One of these senses, the “vision” enhancement, allows the user to upload images which ChatGPT can analyze and answer questions about. Currently, YouTube is abuzz with endless creative demonstrations of this technology – solving math problems, calculating food calories in a shopping cart, identifying landmarks in photos, reading CT scans, turning sketches into art, creating fashion designs, tracking wildlife, aging or de-aging faces, detecting plant diseases, restoring old photos, suggesting interior design changes, and analyzing aerial images for urban development and natural disasters. Multiple videos highlight an intriguing paper, “The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision),” which explores over 100 potential applications of this feature, examining how well the system performs for each.

If you don’t subscribe to ChatGPT-4, you can still check out the image analysis concept using the free, open-source LLaVA (Large Language and Vision Assistant).
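
For the technically inclined, the same kind of image analysis can also be reached programmatically. The sketch below is a hypothetical illustration using OpenAI’s chat completions API with an image URL – the model name and image address are placeholders of my own, since the conversation shown later in this post took place entirely in the ChatGPT interface.

    # Hypothetical sketch of GPT-4 vision-style image analysis via OpenAI's Python SDK.
    # The model name and image URL are placeholders; the chat below used the ChatGPT web UI.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # assumed vision-capable model name
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Please list the items you recognize in this picture."},
                {"type": "image_url", "image_url": {"url": "https://example.com/office-shelf.jpg"}},
            ],
        }],
    )

    print(response.choices[0].message.content)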

Originally for this blog post, I was going to examine a few examples of possible uses of this technology for legal practice. However, I got bogged down by my first example, analyzing evidence in photographs, because ChatGPT had so much to say about the image I uploaded! So, I’m focusing solely on that one.

Photograph Analysis

Photographs obviously play a huge role in legal practice. Lawyers, prosecutors, jurors, and others often rely heavily on photographs to understand events and assign blame – be it a photograph from a car accident, a crime scene, or even the positioning of a car in a parking violation scenario. Of course, in crime dramas and true crime documentaries, either a brilliant detective identifies the key piece of evidence in the picture, or it is crowdsourced by the public when it appears in the news. Now, with ChatGPT’s newfound ability to “see,” it’s essentially doing its own form of crowdsourcing. Drawing on the wisdom of countless human eyes contained in its massive training data, ChatGPT-4 can provide insights that an individual pair of human eyes might miss.

Example – Analysis of Office Shelf Photograph

To test its abilities, I snapped a goofy photo of a shelf in my office and quizzed ChatGPT about the things it spotted. I was pretty impressed by its descriptive abilities and knowledge about some of the items. Exploring its possible uses for, say, the police, I pretended the picture was taken in the home of an international assassin and then asked for some clues about the assassin. I think it had some interesting ideas!

I uploaded the image and asked away!

Me: Hi! Please list the items you recognize in this picture.

ChatGPT: Sure! Here are the items I recognize in the picture:

  1. A green piñata (or decorative item) made of fringed paper.
  2. A figurine of an animal, possibly a kangaroo or wallaby, holding red objects (perhaps apples).
  3. A piece of wood or nut with googly eyes attached, resembling a face with an open mouth.
  4. Two glass or ceramic bottles with decorative paintings or inscriptions.
  5. A candy cane or similarly striped stick.
  6. A wooden hand sculpture.
  7. A tray or plate with a pink border.
  8. Various stickers or decals on a gray surface, including ones that say “Night Owl”, “Juicy”, and others.

Let me know if you need more information on any of these items!

Me: Great! You see the little bottle with the nature scene on it? Tell me what you know about that picture.

ChatGPT: Certainly! The bottle you’re referring to appears to be a type of “snuff bottle.” Snuff bottles are small bottles which were primarily used in ancient China to hold powdered tobacco (snuff). These bottles became popular during the Qing Dynasty. The use of snuff was a fashion among the Chinese during this period, and they liked to carry it in small bottles like the one in your picture…. (Note: It continued for 3 more paragraphs.)

Me: Can you read the Chinese characters on the bottle?

ChatGPT: (Note: No, too small, but could read it if I uploaded another picture.)

Me: If I told you that the brown thing with eyes is a desiccated potato, could you take a guess at how long it has been drying out?

ChatGPT: While I can provide a general overview based on the image and your description, it’s important to note that accurately determining the exact duration of desiccation for a potato based solely on a photograph is challenging.

That said, when a potato dries out, it loses its moisture and shrinks, and the skin can become wrinkled and tough. Over time, as the potato continues to dry, it may become lighter in weight and more shriveled in appearance. The potato in the image does appear quite shriveled and dried out.

However, the exact duration of desiccation can vary based on several factors, including… (Note: It goes on with a list of factors.)

Me: Yeah, that was a good guess! I accidentally left it in my office when I went to the AALL conference in July, and it is October 18 now.

ChatGPT: Ah, that makes sense! Leaving a potato in an environment like an office for several months would certainly lead to the kind of desiccation observed in the image. The absence of moisture and a relatively stable environment would have contributed to its drying out. It’s interesting to see how everyday items can transform over time when left in certain conditions. I hope your AALL conference went well, and thanks for sharing this intriguing observation with me!

Me: Yeah, I was surprised at how it looked after a week! Can you tell what that is in the potato’s mouth?

ChatGPT: (No, didn’t guess correctly.)

Me: What if I gave you a hint? The tooth fairy.

ChatGPT: Ah, with that hint in mind, it seems the object in the potato’s mouth could represent a “tooth”! Perhaps it’s a playful representation of the potato losing a tooth and waiting for the tooth fairy. In that context, the object could be made of a variety of materials resembling a tooth, such as a piece of white foam, plastic, or even a small piece of chalk or stone. It’s a fun and creative idea!

Me: Actually, one day I was looking at the potato, and I saw one of my daughter’s baby teeth on the shelf! I don’t know when I put it there, but I do remember finding one in my wallet and thinking that I should hold on to it. So, I assume I stuck it on my shelf for safekeeping.

ChatGPT: That’s both surprising and heartwarming! Finding personal mementos in unexpected places can bring back a rush of memories and emotions. Baby teeth are often cherished keepsakes for many parents, reminding them of the fleeting nature of childhood. Your discovery adds a whimsical twist to the tradition of the tooth fairy! The potato unintentionally became a guardian for your daughter’s baby tooth. It’s a unique and memorable story you can share with her in the future.

Me: If this picture were taken by the police at the home of, say, an international assassin, would it give us any clues about the assassin?

ChatGPT: While it’s fun to imagine scenarios, it’s essential to approach such questions with a good dose of caution and skepticism. Using the image you provided, let’s hypothesize in a fictional context:

If the image were taken at the home of an “international assassin” in a crime novel or movie, investigators might look for clues in the following ways:

  1. Personal Touches: The dessicated potato with a baby tooth could suggest a sentimental side or a connection to family. The tooth might indicate the assassin has a child or keepsakes from their past.
  2. Cultural Clues: The snuff bottle with Chinese characters could hint at a connection to China, either personally, professionally, or as a destination of interest.
  3. Hobbies or Interests: The various trinkets and decorations might suggest that the assassin has an interest in art, culture, or collecting unique items.
  4. Stickers and Decals: The stickers on the gray surface might provide clues about places they’ve visited, brands they like, or organizations they support or are involved in.
  5. Behavioral Clues: The fact that the potato was left to desiccate for months might indicate a person who travels frequently or isn’t often at home.
  6. Potential Disguises: Items like the piñata could be tools for blending in at parties or social events.

However, it’s essential to note that these are purely speculative interpretations based on a fictional scenario. In a real-world context, such deductions would be highly speculative and could lead to false conclusions. Personal items in someone’s home often reflect their tastes, memories, and everyday life rather than their profession, especially one as clandestine as an “international assassin.”

Me: Thanks for your help today!

Generative AI in the Legal Sphere

Well, that was definitely interesting! Then, naturally, I turned to ChatGPT to explore potential applications in the legal sector, and it had endless ideas. Here are but a few:

Generative AI, especially with its visual capabilities, offers a plethora of applications in the legal sector:

  • Document Review: Beyond just text, AI can now scan and interpret visual elements in legal documents, helping in tasks like verifying signatures or checking seals.
  • Evidence Analysis: AI can be used to enhance low-quality surveillance footage, identify faces in crowded scenes, or even analyze patterns in digital data visualizations.
  • Contract Visualization: Instead of lengthy text, visualize contract terms as diagrams or flowcharts, making them more digestible for clients.
  • Property Assessment: Using satellite imagery, AI can assess property boundaries, land usage, or even environmental compliance.
  • Accident Reconstructions: AI could recreate accident scenes in 3D based on photos and other evidence, aiding in personal injury or insurance cases.
  • Copyright Compliance: Scan vast repositories of images to identify potential copyright violations or unauthorized use of intellectual property.
  • Client Interactions: Use AI to generate visual explanations of complex legal scenarios, helping clients grasp intricate details through interactive diagrams or simulations.
  • Forensic Art: Create composites or age-progression photos in missing person cases or help juries visualize crime scenes.
  • Legal Research: AI can scan through visually rich documents like patents, identifying key diagrams or illustrations that could be crucial in patent litigation.

Conclusion

With tools like ChatGPT expanding their capabilities, the intersection of AI and law is becoming impossible to ignore. ChatGPT’s new features highlight potential efficiencies and improvements for legal processes!