*The Limitations of ChatGPT for Complex Tasks: A Case Study in Multifamily Underwriting*
As ChatGPT continues to gain popularity, I've seen numerous recommendations to use it for financial modeling and underwriting tasks. After spending a month testing it for multifamily underwriting, I'm compelled to share my findings and caution against relying solely on ChatGPT for complex tasks.
*The Fragmented Output*
When I gave ChatGPT a multifamily underwriting dataset (rent rolls, T12s, and operating statements) and asked it to build a model, the output was disappointing. The chatbot generated isolated formulas, a cash flow table, and a cap rate calculation, but failed to tie these components together into a coherent, usable workbook. After 15 rounds of prompting, I had spent about as much time as it would have taken to build the model from scratch in Excel, with the added burden of debugging the chatbot's hallucinations in cell D47.
*The Statelessness of ChatGPT*
The primary issue I ran into is that ChatGPT does not reliably maintain state across a complex, multi-step task. In my testing, each prompt was handled almost like a fresh conversation, even within the same thread, and the output came back fragmented. This is particularly problematic for underwriting models, where assumptions feed cash flows, which feed returns, which feed sensitivities. Once coherence across these layers is lost, the output is neither reliable nor usable.
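To make that dependency chain concrete, here is a minimal Python sketch of the four layers. It is an illustration under stated assumptions, not a real underwriting model: every name (`Assumptions`, `noi_by_year`, `unlevered_cash_flows`, `irr`, `exit_cap_sensitivity`) and every number is hypothetical, the deal is unlevered, and IRR is found with a simple bisection.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Assumptions:
    """Layer 1: inputs. All values are made-up illustration figures."""
    units: int = 100
    monthly_rent: float = 1_500.0
    vacancy_rate: float = 0.05
    opex_ratio: float = 0.40          # operating expenses as a share of effective income
    rent_growth: float = 0.03
    purchase_price: float = 15_000_000.0
    exit_cap_rate: float = 0.06
    hold_years: int = 5


def noi_by_year(a: Assumptions) -> list[float]:
    """Layer 2: net operating income per operating year, plus one forward year."""
    out = []
    for year in range(a.hold_years + 1):
        gross = a.units * a.monthly_rent * 12 * (1 + a.rent_growth) ** year
        effective = gross * (1 - a.vacancy_rate)
        out.append(effective * (1 - a.opex_ratio))
    return out


def unlevered_cash_flows(a: Assumptions) -> list[float]:
    """Layer 2 to 3: purchase at t=0, NOI each year, sale proceeds in the final year."""
    noi = noi_by_year(a)
    sale_price = noi[a.hold_years] / a.exit_cap_rate  # forward-year NOI / exit cap
    flows = [-a.purchase_price] + noi[: a.hold_years]
    flows[-1] += sale_price
    return flows


def irr(flows: list[float], lo: float = -0.99, hi: float = 1.0, tol: float = 1e-7) -> float:
    """Layer 3: IRR by bisection. Assumes one outflow followed by inflows."""
    def npv(rate: float) -> float:
        return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows))

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid) > 0:
            lo = mid   # NPV still positive, so the IRR is higher
        else:
            hi = mid
    return (lo + hi) / 2


def exit_cap_sensitivity(a: Assumptions, caps: list[float]) -> dict[float, float]:
    """Layer 4: re-run the whole chain for each exit cap rate."""
    return {c: irr(unlevered_cash_flows(replace(a, exit_cap_rate=c))) for c in caps}


if __name__ == "__main__":
    base = Assumptions()
    for cap, rate in exit_cap_sensitivity(base, [0.05, 0.06, 0.07]).items():
        print(f"exit cap {cap:.2%}: IRR {rate:.2%}")
```

The point of the sketch is that nothing downstream is typed in by hand: changing one field in `Assumptions` flows through NOI, cash flows, IRR, and the sensitivity table. A set of isolated formulas produced across separate prompts has no such guarantee.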
*Purpose-Built Tools: A Different Approach*
In contrast, purpose-built AI tools for underwriting are designed with a different architecture. They decompose the task into manageable components, run autonomously for 15 to 30 minutes, and check intermediate outputs before returning a complete workbook with actual Excel formulas. The difference is not model quality; it is design philosophy.
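The decompose-and-check pattern can be sketched in a few lines of Python. This is not how any particular product works; it is a toy illustration in which the `Step` type, the `run_pipeline` function, the step names, and all input figures are hypothetical. Each step reads a shared state, adds its result, and must pass a sanity check before the next step runs.

```python
from typing import Callable, NamedTuple


class Step(NamedTuple):
    name: str
    run: Callable[[dict], dict]     # reads the shared state, returns new keys
    check: Callable[[dict], bool]   # validates the state after the step ran


def run_pipeline(steps: list[Step], inputs: dict) -> dict:
    """Run steps in order, stopping at the first failed intermediate check."""
    state = dict(inputs)            # copy so the caller's inputs are untouched
    for step in steps:
        state.update(step.run(state))
        if not step.check(state):
            raise ValueError(f"check failed after step '{step.name}'")
    return state


# Hypothetical three-step valuation: effective income -> NOI -> value.
STEPS = [
    Step("income",
         lambda s: {"egi": s["gross_rent"] * (1 - s["vacancy_rate"])},
         lambda s: 0 < s["egi"] <= s["gross_rent"]),
    Step("noi",
         lambda s: {"noi": s["egi"] - s["opex"]},
         lambda s: s["noi"] > 0),
    Step("value",
         lambda s: {"value": s["noi"] / s["cap_rate"]},
         lambda s: s["value"] > 0),
]

if __name__ == "__main__":
    result = run_pipeline(STEPS, {
        "gross_rent": 1_800_000.0,   # made-up figures for illustration
        "vacancy_rate": 0.05,
        "opex": 684_000.0,
        "cap_rate": 0.06,
    })
    print(result)
```

Because the state is carried explicitly from step to step and checked along the way, a bad intermediate number stops the run instead of propagating silently into the final workbook.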
*Conclusion*
While ChatGPT is suitable for quick questions, brainstorming, and general knowledge, it's not a reliable tool for producing complex, multi-step outputs like underwriting models. For tasks where the output is the deliverable, purpose-built AI tools are the more effective and efficient choice. As we continue to explore what AI can do, it's essential to recognize the limitations of each tool and choose the right architecture for the job.