Take Away: At Tag1, we believe in proving AI within our own work before recommending it to clients. This post is part of our AI Applied series, where team members share real stories of how they're using Artificial Intelligence and the insights and lessons they learn along the way. Here, Tag1 Founding Partner and CEO, Jeremy Andrews, explores how, with the right setup and workflow, AI can act as a reliable development partner for load testing: consistent and detail-oriented in ways that improve outcomes for engineers and clients alike!

Overview

When a prominent government client approached us with performance issues on their Drupal website, it presented a perfect opportunity to apply our AI collaboration approach to a new area: writing load tests with Goose, our open-source load testing framework.

Tag1’s engineering team has adopted AI methodically to enhance development and delivery processes across projects, focusing on where it adds real value rather than chasing trends or shortcuts. The scope of this project was complex enough to be meaningful (a variety of different page types, authenticated user scenarios, realistic multi-page journeys, and validation for both performance and content correctness at scale) and provided a good opportunity to demonstrate our ever-evolving AI-assisted development approach.

When working with client code and data, we use Tag1's internal AI infrastructure, which hosts AI models on our own secure cloud platform. We rely on these internally hosted models rather than third-party AI services, ensuring client information never leaves our control. This approach gives us the AI capabilities we need while maintaining the security and compliance standards that government and enterprise clients require.

While AI didn't immediately speed up development, it delivered measurable improvements in code quality, consistency, and knowledge transfer. Our experience using AI as a development partner for load testing highlighted its strengths in pattern recognition and systematic implementation, but also reinforced the critical need for structured workflows and continuous human validation. This deliberate approach ultimately allowed us to build reusable foundations for future projects while refining our AI-assisted development practices along the way.

Enhancing Load Testing through AI Collaboration

Load testing is critical for identifying performance bottlenecks before they impact users or cause website outages. Through the numerous performance audits we conduct each year, we see that most organizations only realize they need load testing when performance issues become business-critical problems. By implementing it proactively, you can stress test code changes, perform rapid regression testing, and understand how modifications affect performance. The ability to run identical tests repeatedly makes it invaluable for measuring real impact on your code and infrastructure.
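
If you haven't seen Goose before, a load test is just a Rust program: transactions are async functions that request and validate pages, and scenarios group them into user journeys. Here's a minimal sketch (the scenario name and path are placeholders, and it assumes goose and tokio as dependencies):

use goose::prelude::*;

// A transaction is an async function that drives one step of a user journey.
async fn load_front_page(user: &mut GooseUser) -> TransactionResult {
    let _goose = user.get("/").await?;
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), GooseError> {
    GooseAttack::initialize()?
        // A scenario groups transactions into a realistic user journey.
        .register_scenario(
            scenario!("AnonymousUser").register_transaction(transaction!(load_front_page)),
        )
        .execute()
        .await?;
    Ok(())
}

Running the resulting binary with different --users and --run-time options is what makes repeated, identical test runs so easy.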

Typically, this kind of work involves a day of manually analyzing page structures, writing individual transactions, testing each one, and iterating based on results. As we've been integrating AI into all processes, this project gave us a chance to demonstrate how well our approach handles the detailed, methodical work that load testing requires. The goal wasn't just to deliver solid load tests, but to refine our established processes for AI-assisted technical work and create reusable foundations for future projects.

Setting Up for AI Success

We leveraged Cline, an AI coding assistant that maintains a "memory bank" of context files. We've found that spending time upfront, carefully building this memory bank, pays off throughout the development process. The memory bank includes several types of context files:

  • Project brief: What we were trying to accomplish and any constraints;
  • System patterns: How the project works and our coding approach;
  • Technical context: Framework details, dependencies, and environment setup;
  • Active context: Where we were in the process and what came next.
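
Concretely, the memory bank is just a directory of markdown files that lives alongside the code. A rough sketch, using Cline's usual file naming rather than anything project-specific:

memory-bank/
├── projectbrief.md     # goals, scope, and constraints for the load test
├── systemPatterns.md   # how transactions are structured and validated
├── techContext.md      # Goose version, dependencies, environment setup
└── activeContext.md    # where we are in the process and what comes next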

A critical part of this setup was establishing clear rules in a .clinerules file. These weren't just coding guidelines; they also included instructions for how we'd work together on this project, written from the AI's perspective:


## Critical Implementation Notes
### Collaborative Implementation Workflow
**Security Constraint**: AI cannot access secrets or page sources directly.
**Established Process**:
1. **You provide**: Page source HTML for each template type from performance environment
2. **I implement**: Transaction function with structural validation for successful page load detection  
3. **You test**: Run transaction against secured environment, then share the built-in Goose logging
4. **We iterate**: Refine validation based on test results

This collaborative workflow proved effective in practice. Since the AI couldn't access the secured pages directly, our developer would provide it with the HTML from those pages, and it would figure out which elements to check to confirm a page loaded properly. These checks look at page structure rather than specific content, so they keep working even when the actual text changes. This was particularly useful for posting complex forms. Instead of having to catalog every field and validation element, the AI could quickly identify what mattered.
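
In practice, a structural check built this way looks roughly like the sketch below, using the Validate helper from Goose Eggs. The path, title, and markup fragment are illustrative placeholders, not details from the client's site:

use goose::prelude::*;
use goose_eggs::{validate_and_load_static_assets, Validate};

async fn load_article_page(user: &mut GooseUser) -> TransactionResult {
    let goose = user.get("/articles/example").await?;

    // Validate structure (status code, <title>, and stable markup) rather than
    // editorial copy, so the check keeps passing as content changes.
    let validate = &Validate::builder()
        .status(200)
        .title("Articles")
        .text(r#"class="node--type-article""#)
        .build();

    validate_and_load_static_assets(user, goose, validate).await?;

    Ok(())
}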

Where AI Helped: Consistent Implementation

Once the team got into a rhythm and settled on the proper prompts, the AI became quite effective at building validation logic from the provided HTML snippets. It would recognize when something needed multiple steps (like search pages that lead to results pages) and then apply that same pattern for similar situations. After walking it through the first few transactions, it was able to handle new ones consistently without requiring re-explanation of the approach.
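
A multi-step transaction extends the same validation pattern across a small journey. Here's a rough sketch of a search-then-results flow; the paths, query string, and markers are placeholders:

use goose::prelude::*;
use goose_eggs::{validate_and_load_static_assets, Validate};

async fn search_and_view_results(user: &mut GooseUser) -> TransactionResult {
    // Step 1: load the search page and confirm the form is present.
    let goose = user.get("/search").await?;
    let search_page = &Validate::builder()
        .status(200)
        .text(r#"id="search-form""#)
        .build();
    validate_and_load_static_assets(user, goose, search_page).await?;

    // Step 2: submit a query and confirm the results page renders.
    let goose = user.get("/search?keys=example").await?;
    let results_page = &Validate::builder()
        .status(200)
        .text(r#"class="search-results""#)
        .build();
    validate_and_load_static_assets(user, goose, results_page).await?;

    Ok(())
}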

But this success only happened because we tested everything as we made progress. Reviewing and testing each transaction helped the team understand what AI excels at (recognizing patterns and staying consistent) while maintaining the human oversight needed to prevent over-engineering and ensure everything functioned correctly.

Learning AI's Blind Spots

While the collaboration worked well overall, we had to learn what the AI's default behaviors were and build guardrails around the problematic ones.

Sometimes the AI would make the change we asked for but also "improve" unrelated code in the same file: renaming variables for consistency, duplicating functions with minor changes, or adding error handling where it wasn't needed. This created unnecessary churn and made it harder to see what actually changed. At times the AI wanted to over-engineer solutions, suggesting configuration layers or abstraction patterns when a simple, direct approach was most appropriate. And predictably, if you've done LLM-guided coding before, when the AI suggested adding a new dependency it would often try to use an older version of the Rust crate, perhaps six months to a year behind current releases.

The solution for all of these was updating the memory bank files and `.clinerules` with better directions, and suggesting that Cline query the Context7 MCP server for up-to-date crate information. That way we didn't have to repeat ourselves or re-explain the same issues. Once we told it to update the context files with the new constraints, it followed those rules consistently going forward.
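
To give a flavor of what those guardrails looked like, here's a paraphrased sketch of the kind of rules that ended up in `.clinerules` (not a verbatim excerpt):

## Code Change Guardrails
- Only touch code directly related to the requested change; do not rename, refactor, or "improve" unrelated code in the same file.
- Prefer the simplest direct implementation; do not introduce configuration layers or abstractions unless explicitly requested.
- Before adding or updating a Rust crate, query the Context7 MCP server for the current version and API instead of relying on training data.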

Solving the Authentication Problem

The next challenge came when adding static asset loading to the tests. We had started with a simple proof-of-concept approach: just adding basic authentication headers to each individual page request. This worked fine at first, and we prefer to keep things simple until there's a reason to add complexity.
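
For context, request-level authentication in Goose means attaching the header to each request individually, via a custom reqwest RequestBuilder. The sketch below reconstructs the general shape rather than quoting our actual proof of concept, and the hard-coded header value is a placeholder:

use goose::prelude::*;
use reqwest::header::AUTHORIZATION;

async fn load_front_page(user: &mut GooseUser) -> TransactionResult {
    // Build a custom request so the Authorization header can be attached to
    // this one page request. The value here is base64 of "user:password" as a
    // placeholder; real credentials would come from the environment, never the source.
    let request_builder = user
        .get_request_builder(&GooseMethod::Get, "/")?
        .header(AUTHORIZATION, "Basic dXNlcjpwYXNzd29yZA==");

    let goose_request = GooseRequest::builder()
        .method(GooseMethod::Get)
        .path("/")
        .set_request_builder(request_builder)
        .build();

    let _goose = user.request(goose_request).await?;
    Ok(())
}

This works for the pages a transaction requests directly, which is exactly why it broke down once helper functions started making requests of their own.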

When we started using the Goose Eggs validate_and_load_static_assets() helper function to make the tests act more like real browsers, those static assets kept failing to load with 401 Unauthorized errors. The problem was that this helper function makes its own separate HTTP requests for static assets, and those requests didn't include our authentication headers.

Rather than stopping to dig through the documentation, we described the problem to the AI as we would to a junior developer. It leveraged the context we had provided about Goose when we built the initial memory bank, and quickly figured out the right approach: use client-level authentication with Goose's set_client_builder() method, called once per GooseUser:


use std::env;

use base64::Engine as _;
use goose::prelude::*;
use reqwest::header::{HeaderMap, HeaderValue, AUTHORIZATION};
use reqwest::Client;

async fn setup_basic_auth_client(user: &mut GooseUser) -> TransactionResult {
    let mut headers = HeaderMap::new();

    // Get credentials from environment variables rather than hard-coding them.
    let username = env::var("BASIC_AUTH_USERNAME").expect("BASIC_AUTH_USERNAME not set");
    let password = env::var("BASIC_AUTH_PASSWORD").expect("BASIC_AUTH_PASSWORD not set");

    // Build the "Authorization: Basic <base64(username:password)>" header value.
    let credentials = format!("{}:{}", username, password);
    let encoded = base64::engine::general_purpose::STANDARD.encode(credentials.as_bytes());
    let auth_value = format!("Basic {}", encoded);
    headers.insert(AUTHORIZATION, HeaderValue::from_str(&auth_value).unwrap());

    // Rebuild this GooseUser's client so the header (and a cookie store) is sent
    // on every request, including the ones made for static assets.
    // APP_USER_AGENT is a user agent string constant defined elsewhere in the test.
    let builder = Client::builder()
        .user_agent(APP_USER_AGENT)
        .default_headers(headers)
        .cookie_store(true);
    user.set_client_builder(builder).await?;

    Ok(())
}

This way every request automatically had authentication headers, including requests for static elements. It saved us having to context switch and read through docs, instead letting us stay focused on the bigger picture.
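
For completeness, here's how a setup transaction like this gets wired into a scenario so it runs once per GooseUser before any page transactions; the scenario name is a placeholder and load_article_page stands in for the real page transactions:

use goose::prelude::*;

#[tokio::main]
async fn main() -> Result<(), GooseError> {
    GooseAttack::initialize()?
        .register_scenario(
            scenario!("AuthenticatedUser")
                // set_on_start() runs this once per GooseUser, before anything
                // else, so the authenticated client is in place from the start.
                .register_transaction(transaction!(setup_basic_auth_client).set_on_start())
                // ...page-loading transactions are registered as usual.
                .register_transaction(transaction!(load_article_page)),
        )
        .execute()
        .await?;
    Ok(())
}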

Documentation That Actually Stays Current

The best part of this approach was probably the documentation. The AI didn't just write the load tests; it kept updating the README and other project context files as we went. Every time we added something new, it would automatically update the documentation to match.

The README it created was complete and useful. It covered everything you'd need: a project overview, how to set up authentication, running different test scenarios, and interpreting results. It even added references to external documentation like the Goose Book and API documentation. The rules we'd established enforced a process within Cline that kept the documentation current as the code changed.

This addresses something developers generally struggle with: Performance audits often have tight deadlines and a lot that needs to get done. In the rush to get everything working, documentation either gets written early and becomes outdated, or gets thrown together at the end. The AI treated documentation as part of writing the load tests itself, updating it with each change without slowing anything down.

This solves a real problem: good project documentation means anyone can jump in and be productive right away. Whether it's other Tag1 engineers or client teams taking over, they can easily understand and modify the tests. AI-maintained documentation keeps that material comprehensive and current in a way that would otherwise be difficult under typical project constraints.

Measuring the Impact

The implementation took about a day and a half, compared to our usual half-day to full-day timeline for similar projects. (We only billed the client for the standard timeline, considering the extra time our own investment in refining our AI-assisted processes.) The client got thorough load testing with much better documentation than we typically have time to provide under tight deadlines, and we created reusable context files that make subsequent Goose load tests much faster to write. We also help our clients adopt AI strategically, and the structured workflows from projects like this give them a foundation for building their own AI-assisted practices.

This work sets us up well for future load testing projects. The Goose-specific memory bank and workflows we developed here mean we won't need to rebuild that foundation each time. And getting comprehensive documentation automatically solves a real problem. It's usually the first casualty when project timelines get compressed, but now it just happens as part of the development process.

What Made This Work

The key wasn't the AI writing code for us. It was the AI serving as an intelligent collaborator that could:

  1. Maintain context and apply patterns consistently. The memory bank system kept all project knowledge organized, and once we established architectural patterns, the AI applied them uniformly across all transactions.
  2. Handle tedious but important detail work. Converting HTML analysis into validation rules, maintaining documentation, and ensuring naming consistency throughout.
  3. Enable rapid iteration toward better solutions. When our initial proof-of-concept request-level authentication approach hit limitations, the AI quickly helped reimplement it as a cleaner client-level solution.
  4. Build on existing expertise rather than replace it. The AI enhanced our established Goose framework and load testing methodologies instead of trying to reinvent proven approaches.

Looking Forward

These AI-enhanced workflows represent how we've evolved our practice: building on nearly two decades of technical expertise as a leading consulting company, while responsibly integrating new tools that amplify our capabilities. Rather than chasing "AI-first" approaches, we're finding the most value in coupling AI with proven frameworks like Goose and Drupal, and with the best-practice performance optimization methodologies we've developed over time. This approach doesn't just make individual projects better. It compounds our expertise, making each subsequent project faster and more thorough than the last.

We look forward to future performance audits. Now that the Goose-specific memory bank files exist, we get all the benefits we experienced this time and can work considerably faster since we won't need to recreate that foundational context.

As Goose continues to evolve and gain new features, we'll need to update our memory bank and refine our workflows to stay current. This ongoing maintenance is part of the process, but the collaborative workflow we've established makes it straightforward to incorporate new Goose capabilities and scale to larger, more complex load testing projects.

The future of load testing isn't just about generating more traffic. It's about creating comprehensive, maintainable test suites that our clients can use and improve independently. The complete, current documentation from this AI-assisted approach is a significant step toward that goal.

Putting It All Together

The real advantage comes from creating systems and documentation that grow more valuable over time. Clear workflows let AI preserve knowledge, reduce redundant work, and strengthen future projects.

This post is part of Tag1’s AI Applied series, where we share how we're using AI inside our own work before bringing it to clients. Our goal is to be transparent about what works, what doesn’t, and what we are still figuring out, so that together, we can build a more practical, responsible path for AI adoption.
