Exploratory Testing

What is exploratory testing?

There is always more that can be tested than you have time for. A tester’s mission is to best choose where and how to spend their time.

  • wandering - purpose = lost
  • wandering + purpose = exploring
  • exploring + judgment = exploratory testing

Exploratory testing allows the tester to balance risk (the consequences of a failure) and coverage (observing all possible behaviors). It brings test design and execution together: you use the information you’ve gathered so far to immediately change what you’re going to do next.

How does exploratory testing fit in with automated tests?

The code in automated tests tells you what’s expected (at the time the test was written, by the person who wrote it). The output from the automated tests tells you what you got (at the time the test was run, to the person paying attention to the output).

Automated tests can’t do the evaluation work to tell you if the difference between what you expected and what you got is risky. They can’t answer questions like:

  • Did we build the right thing?
  • Has what we expected changed?
  • How does it all fit together?
  • If what we got has changed, is that a problem for the customer?
  • What didn’t we think of?

Valuable testing includes both automated and exploratory testing.

How do you do exploratory testing?

Testers keep these things in mind as they’re exploring an application. (Skilled exploratory testers can describe their thinking to their team later, or even as they’re doing it.)

1. A basis for comparison

When you find something you don’t expect, you’ll need a way to explain to other people why you don’t expect it. In deciding whether something is unexpected, you might find yourself referring to:

  • the history of the product
  • the image of the product and organization
  • claims made about the product by marketing, sales, documentation, conversations, user stories, etc.
  • users’ expectations
  • other behavior within the product itself
  • the product’s purpose
  • statutes and standards (legal requirements, SLAs, accessibility standards)

2. Rules of thumb and checklists

Having a list of things that have gone wrong in software in general can help you identify similar patterns in your own product. Such a list won’t prevent the unexpected, but having some ideas at your fingertips may help you uncover unexpected things earlier.

3. Deciding what to focus on

Setting a mission for your exploratory testing helps you decide what’s in and out of scope, and what you’re trying to accomplish so you don’t get lost. (Exploring is wandering with a purpose.) Try writing down:

  1. where you’re exploring (the test environment, the new feature, etc.)
  2. what you’re using/how you’re exploring (the automated tests, the logs, the accessibility scanning browser extension, etc.)
  3. what question you want to answer (are the existing tests passing, are we logging at the right level, has the extension uncovered new issues, etc.)

What’s hard about exploratory testing?

It’s not just poking around! It can be hard to describe why, when, and how to do it. Keeping all the things I’ve listed here in mind at the same time takes practice. It’s hard to know when you’re done, or whether you’ve done enough. And since it’s usually best to do both exploring and automating, finding the time can be tricky or hard to advocate for. Having the brainpower to be actively learning the whole time you’re working is hard.

Find My Friends of Good Software

We had a Friends of Good Software (FroGS) remote lean coffee last week. It's a structured conversation that gives people a chance to write down their topics and vote on them, both to choose the order of the topics at the start and to decide whether the current timebox is enough time for the topic. Timeboxes get shorter and shorter to keep the ideas and the blood flowing.

We gather our FroGS quarterly for an online open space or lean coffee. All our events abide by the four laws and one principle of open space, the hardest of which always seems to be: Whoever comes is the right people. Partly from a grammatical point of view, but mostly from an "I wish X person could have been here for this conversation" point of view.

As with each event, it was clear this time too that those who were there were the right people.

  • Someone wondered aloud: how do I get my developers interested in testing? A fellow Friend of Good Software replied with what felt like a completely unreplicable personal anecdote: bully your developer into presenting at a developer conference, so they'll meet a bunch of kind-hearted testers enthusiastic enough to inspire the developer's interest. Then someone from the other breakout room relayed a similar anecdote in the hangout later: bully a family member into joining you at a testing conference, and let said family member learn enough from enthusiastic testers to break into testing.

  • Someone noted: I would love a talk about font accessibility right now. To which another one of our FroGS replied: I have given a talk on font accessibility, I'll send it to you.

  • One person asked: tell me about your experiences of having a guest in your mob or ensemble. In fact, I had been exactly such a person, and the ensemble facilitator was also among the handful of people in our breakout room.

Yes, of course, we did "cover" "topics" too. This lean coffee included:

  • using agile practices on fixed-cost projects
  • how to set up a test strategy for a product that's never had a test strategy
  • increasing visibility/recognition for testing activities
  • how to keep curious about things you've done before
  • what you're learning now

Our FroGS brought what's currently on their minds, got helpful tips and suggestions, and came away with notes on the Miro board for later.

But the things I remember are those moments of serendipity, the things that feel like they can only happen by accident or with great care, with the right people in the room. Whoever comes is the right people.

Don't Call It A Bug

The situation

In 2015, I was on a large team working to skin the pages of a Drupal content management system (CMS). I'd tested a Django CMS before, but Django is built on Python. Drupal runs on PHP. Every error page I triggered was an exciting new adventure of digging into what the problem might be, and which of the ~15 developers I should bring the problem to. All but one of them were new to Drupal too.

Half of the team worked out of the office in Brooklyn, New York. The other half of the team worked out of the office in Bogotá, Colombia. Our main coordination meeting with the whole team was the standup meeting in the morning. Each group piled into a conference room. Most of us would get a chance to yell towards the microphone across the long table, and strain to hear our colleagues in the other hemisphere doing the same. The unlucky ones only got to do the latter, until another project kicked us out of the meeting room.

Imagine how well we communicated and trusted each other in such an environment.

The perspectives

Imagine now, you're a developer on the team in Colombia. Elizabeth in New York has found a problem, and she thinks it's a bug. She found it testing your story, but you don't know enough about the system to know if it's even your problem. There are 14 other developers on the team, and any one of them could have caused it.

Imagine now, you're my boss, in charge of testers on several different projects around the company. You're trying to look at Jira to get some insight into my work. You do a search for "Bugs created by Elizabeth in the past two weeks."1 You are surprised to discover that Elizabeth seemingly hasn't found any bugs in the past two weeks.

The conversation, and my perspective

My boss came to talk to me. They asked why I hadn't found any bugs. Surely with a project as late and over-budget as ours, it must be rife with bugs?

I had found bugs. But I didn't mark them as type Bug in Jira. I filed them as feature requests.

I'd noticed that any Bug I filed came with hours of back-and-forth about whether it was a bug, whose problem it was, and a fight about whether the application behavior should change at all. Any Feature Request I filed was eagerly picked up and built in the time a Bug fight would have taken.

The issues themselves were phrased nearly identically:

  • Bug report I didn't file: When I go to the detail page from the list view, I get a 500 error. Instead, I should get the details that correspond to the title I saw in the list view.
  • Feature request I did file: When I go to the detail page from the list view, I should get the details that correspond to the title I saw in the list view.

Same idea. Same code change. Different issue type in Jira. Why?

Feature requests had story points. A developer who implemented a feature request had created something we needed where it wasn't before. At the end of the sprint, the number of points delivered by each developer could be tabulated in Jira. (I don't believe this was tied to compensation, but measurable outputs in Jira -- as evidenced by my boss's inquiry -- did seem to be a social currency at the company.)

Bugs did not have story points. A developer fixing a bug would have completed fewer story points at the end of the sprint. They would also have to scream into a conference room microphone at standup the next day about how they couldn't pick up any new story because they were fixing a bug of their own making.

My boss reacted to my explanation by simply switching their Jira filter from "Bugs created by Elizabeth in the past two weeks" to "Feature requests created by Elizabeth in the past two weeks."

Lessons learned

With the power of hindsight, there's a lot more I could have dug into about a culture that uses numbers from Jira as currency. Regardless, I do think this experience made me a better, more collaborative tester focused not on getting the credit/being right/finding the most bugs, but on getting the application fixed.

I like to think back on my time at this company as my fastest learning experience. But when I was in the thick of it, all I knew I had learned from this experience was:

  • Jira does not tell the whole story.
  • Approaching a situation with curiosity or excitement will get you a better outcome than approaching it as a fight.
  • Getting the bug fixed is more important than labelling it as a bug.
  • Keeping the lines of communication open to be able to deliver a message might be more important than any particular message.

Other lessons reinforced these ideas for me. I learned from Black Box Software Testing that (paraphrasing) an effective tester is someone who gets bugs fixed. Liz Keogh's Agile Testing Days keynote in 2018 on how to tell people they failed (and make them feel great) argued in favor of positive reinforcement. In digging that link up, I found this old post of mine included a bit on keeping lines of communication open.

What political games have you played in order to get things done? When have you sacrificed credit or acknowledgement for progress? What's a worse metric to track in Jira to pit developers against each other than story points?


The way to get things done is not to mind who gets the credit of doing them.
~ Some dude


Bug, or feature request?
Photo by Timothy Dykes on Unsplash


  1. At the time, I could have dictated the exact JQL (Jira query language) you'd need for this advanced search filter. I have nothing but gratitude for the brain space that forgetting these minutiae has freed up.

Stop Talking About the Work and Just Do It.

I'd responded to an email to the whole company. "Let me know if you're interested in contributing in any way" it said. I was interested.

The context

There was a mentorship program that was part of the Girls Can Code Africa project. It connected mentees to mentors, and it was being sponsored by our corporate overlords.

A few weeks after my initial email, I received an Outlook invitation for an hour-long meeting. I showed up.

The meeting

After an introduction nearly as brief as the one above, a variety of unintroduced fellow meeting attendees started listing features for the application they wanted to build. A tech stack and architecture were declared. Goals were assumed. Scope exploded as everyone coalesced towards a solution they'd seen work in other situations before: filling a backlog with user stories to capture requirements.

But nobody was capturing these verbal requirements. I took a handful of notes in my notebook before I realized that everyone was going to need them. I switched to a Dropbox Paper document and shared my screen1.

For the features people described, I would have estimated a full-time team of three people could get it all done in 6-8 weeks.

This is when I switched from patiently listening to asking questions to clarify my understanding:

  • Do we have any time constraints or a particular deadline?
  • Are we tied to a particular architecture?

I wasn't expecting either of the answers I received.

We did have notable time constraints that would completely shape the project: we needed to have something to show the corporate overlords and another, higher-profile stakeholder at a meeting in less than two weeks. We didn't have a team to work on this. Most others joined this meeting just as I had. They were kind, curious volunteers with nothing in common.

We were tied to the architecture. But in asking about it, I found out why: my company wanted to showcase how quickly they could build a product with their own platform. We wanted to impress the stakeholders more than the mentors or the mentees with this one. I could build a Microsoft Form that covered 80-90% of what we'd scoped down, but that wouldn't meet this goal for the final product2.

That's when I had to bring our heads out of the clouds.

  • There is no way we can build all these features with the given amount of time and people power. What are the most important things to build first?

I acknowledged in a straightforward way that the scope needed to be cut, and redirected the conversation to what was still unclear. As a group, we spent some time weighing the pros and cons of delivering features in a particular order. It helped everyone become more comfortable with the idea that we could deliver what we promised sooner, but we would promise less than...well, everything.

Spontaneous ensemble

We were about 40 minutes into the meeting when everybody was ready to part ways, have one person write a set of user stories to populate the backlog, and then gather together again in a day or two to review and refine the backlog before continuing with any other work.

To me, that seemed like a big gap in time if we only had 10 days to deliver. People were about to start saying their goodbyes and waving at each other over the video call when I interrupted with one last question, by far my favorite:

  • Could we start building this together right now? What if we spent the next 20 minutes getting as far as we can in setting up this app?

That turned this meeting around. We immediately went from having questions we could only guess the answers to ("How much time do you really have to work on it this week? How many of these features can we deliver in 10 days?") to gaining more information about the answers.

We spun up a blank application. We started setting up the domain model. People who had expertise on a topic chimed in when we were going in an untenable direction. We didn't have a working app in 20 minutes, but we set the stage for groups of people to have a common understanding about what needed to be built. Stakeholders who hadn't seen development work with their own eyes before had a glimpse into the nitty-gritty steps of starting from scratch on a project.

Way of working

I couldn't completely change the way of working of a hodgepodge of people I had never met and had no prior relationships with in the course of a 60-minute meeting. We still ended up writing a backlog and assigning user stories. When we gathered at the end of the first week to review what had been done so far, most of the stories were half-done and very few were completely finished.

But we had a clear understanding of what had been done, what was left, what was most important, and confidence that we could achieve our top priorities even with our tight deadline. Our constraints gave us clarity and focus, and the ensemble helped us stop talking about the work and start doing it.


Where can you do less talking about the work and more doing of the work? Have you been stuck in a refinement meeting where you wonder if it would be faster to complete the story while you have all the right people gathered together in that (virtual) room? What are you going to do about it next time you see this happening?


  1. I find it much more effective for everyone to double-check in real time how I've understood what's been said, rather than waiting until they read the notes later individually.

  2. I did build a Microsoft Form in 10 minutes during the meeting. I wanted both a backup (in case the app couldn't be shown for some reason) and a reference point (so the developers knew what we were working towards). I dropped a link to it in the chat during the meeting, but it didn't generate much discussion. 

Anarchist Software Architecture

An Assertive Tester at work sent me a direct message declaring that the two of us should decide which repository the tests should go into. They'd decided we were "the deciders" here since they saw themselves as the highest-ranking tester in their department, as I am in mine.

For most technologies, it makes sense to host the tests in the same repository as the code. For a variety of very good reasons I'll go into below, the API and browser-level tests for our apps are all hosted in one big, shared repository. Older browser tests were using Selenium, but some teams had started to move towards using Playwright.

The Assertive Tester wanted to decide if new Playwright tests should join the existing tests in the big, shared repository with the other tests, or exist in their own git repositories until the apps themselves could be moved to git. They set a meeting with me for a few days later.

Invite your comrades

I wasn't in the unit where most of these teams using the big, shared repo were. I didn't know their day-to-day struggles, or even if the tests in the repo were still being run.

I reminded the Assertive Tester that I wasn't an expert in everyone's context, and asked them if it would be all right if I invited the people who write tests for other teams. The Assertive Tester agreed, if reluctantly.

Ask them for their context

I posted in the Slack channel we have for the members of the big shared testing repo, inviting them to the gathering in a few days. In the thread beneath the message (a mistake: I should have included all my questions in the more visible channel message), I asked them to answer a few questions, whether or not they could join the conversation. The questions were:

1. Is your project hosted in our legacy version-control system or gitlab?
2. What are you using for browser automation now?
3. Is collaboration across teams important for your testing framework?

The day before the gathering, I followed up in the Slack channels of individual teams to find out about their browser tests.

Give them sufficient context

Then I started a document. Initially, it was to collect the responses to those three questions that were now scattered across a few channels. I collected them in an orderly table. I later added a couple sections at the top to make sure the gathering could be contained within the hour as scheduled. I listed what I knew about the situation for the first few years that the big shared testing repo was used:

1. Tests couldn't be stored in the same place as our apps.
2. Code for login, API tokens, and other things all our apps would need was maintained by one team.
3. We all worked in the same unit, sat near each other, and collaborated in an ensemble to write browser automation code together.

Gather to discuss

The Assertive Tester kicked off the meeting before handing it over to me. I gave a brief history and summarized the questions for our consideration. Then we went around to each participant to confirm the details in the table and find out if there was more to the story. The discussion led us to two more salient points in our history:

4. We required merge requests to be approved by the one maintainer, and later one of four maintainers.
5. We wanted people to learn Python as part of their skill set.

And several more points for our consideration:

6. The Assertive Tester found the .gitlab-ci.yml for the big shared testing repo was too complicated when it contained both Selenium and Playwright. (Finally, I discovered the motivating reason for this discussion!)
7. Teams like being autonomous! The team (not just the tester) should own the test automation code.
8. Do we want to teach/enforce how to write tests in a certain way? Would it help people to onboard and quickly use a more shared testing framework?
9. The big shared testing repo has too many files to navigate easily. It's hard to convince developers to use it.

Only point 6 reflects the earlier point in the discussion I captured before a conclusion was reached: we don't need to optimize for people moving between teams because it happens so rarely.

Looking at the table of who was using what:

  • some teams had their app code still in our legacy version-control system
  • some of those teams had abandoned the big shared testing repo and had their own gitlab repos for their Playwright tests
  • some teams were using frameworks other than Selenium or Playwright for their tests (we added a section to capture those pros and cons)
  • some teams weren't testing things in a browser

No two teams of the nine total were in the same situation. With most of the history for one big shared repo no longer applicable, the desire for developer collaboration within teams over tester collaboration across teams, and a relatively easy setup time for authentication, we decided each team should host their own tests. And when possible, put them in the same git repo as the application code.

Write things down

With the exception of point 6, the rest of these points above are captured as facts. I appointed myself note-taker and screen-sharer for this conversation. I wrote down the questions we discussed as they were introduced, and overwrote them with the answers as we decided on them. At the bottom of the document (though I should have put it at the top), I added a Decision section. It's very short: one sentence and three bullet points linking to repositories. The sentence is: Let's not share a repo for all of the tests.

Captured in this way, anyone could glance through this document, understand what we talked about, determine who was involved in the decision-making, and see how we came to the conclusions we did without having to watch the whole Zoom recording of the meeting.


Why am I telling you this? Because I discovered this kind of high-level decision across teams has a name: architecture. I made an architectural decision about the tests in teams around the company! Well, not exactly that, even better: I brought together the forces that made the decision become clear.

At BoosterConf recently, I met Andrew Harmel-Law and saw his talk called "A Commune in the Ivory Tower: A New Approach to Architecture." He is literally writing the book on anarchist decision-making in software. (He came to my open-space session about job titles after I introduced myself as an organizational anarchist. Hurrah for outside-the-box labels!)

You should really watch his whole talk. It was how I discovered I was doing architecture, but in a decentralized, non-blocking, pulling, parallelizable fashion that didn't require consensus. The document I created to prepare for and capture notes from that meeting serves as an architectural decision record for the big shared testing repo. I consulted the affected parties and offered advice (that teams can happily ignore) without getting in the way of their work.

Andrew also happened upon this process himself as a way of not being the architect-as-bottleneck on projects. As these things go, he also discovered that this had already been discovered by Ruth Malan.

That's the whole story for today. I'm planning on reading more about anarchy and thinking about how it can influence my work, or possibly already does. For now, go forth and become anarchist architects!

Online Open Space Technology

I'm one of the organizers of the Friends of Good Software Conference, or as we like to call it: FroGS Conf. We've settled into a routine with our event schedule: two whole-day unconferences per year on Saturdays, plus two hour-and-a-half lean coffees on weekday afternoons.

We've got participants who can only come on the weekend, or only come on a weekday, or can only come in an Americas-friendly time zone. We try to accommodate everyone, but not all at the same time. My fellow organizers and I are busy enough that it would be difficult to run events more often.

We've continued to run FroGS as an online event because our participants are so global. The most important "technology" we use for the event is open space technology, which means that we organizers create a structure for everyone to hold sessions and share their ideas. You can read more about that at frogsconf.nl.

For our software tools, we've been using Welo, Miro, and Slack for all our events, and just recently replaced Google Forms and Sheets with MailerLite.

Welo

We've been using Welo as our video-conferencing tool. One of the previous organizers discovered the tool, kindly asked the founders if we could try it out, and we've been using the same goofy set of rooms by the river ever since. It lacks some of the finer audio filtering features of Zoom, but makes up for it in the visual representation of the space. You can see who's in which breakout room, so you can find people you want to talk to.

Welo video conferencing software

Miro

Miro is our whiteboarding tool. We've got a bunch of different frames that fall into two categories: locked frames with explanations the organizers have prepared ahead of time, and empty frames for participants to contribute to.

Our locked frames

Our locked frames have explanations about: the code of conduct, the laws of an unconference, how Welo and Miro work, why the rooms are named the way they are, and which rooms are for regular sessions vs. hallway/breakout sessions.

Our empty frames

Introduce yourself

We've got a frame for participants to introduce themselves. Forget someone's name or need their LinkedIn? There'll be a little card with their name, face, and contact info to help you out. We'd tried networking or get-to-know-you type activities to start the event in the past, but 10am is not the ideal time for that. We let the Miro board speak for us while we're still drinking our coffee or tea.

Marketplace Topic Queue

We've got a marketplace topic queue that we use at the start of the morning and the afternoon. This is where participants (and organizers!) write down their session proposal, with a title and a category. During the marketplace, they get to briefly pitch their session to entice others to join them. Once they've done that, they drag their stickie onto the...

Session frames

Our session frames double as a schedule for the day, and a place to take notes within each session. The organizers have set the structure of the day - how long the sessions last, how many there are - but the participants are the lifeblood of the event: bringing topics, running sessions, and taking digital notes on the Miro board.

Other whole-day frames

We've got frames for announcements and kudos, to help participants and organizers call out particular changes or high-fives that need to be distributed before the retro. We use the retro frame at the end of the day, though we encourage people to capture feedback for us throughout the day as it occurs to them. We close the day with a brief reflection and discussion, and hang out until the last of the stragglers need to sign off.

Slack

Our Slack is very quiet. We mostly use it amongst the organizers to coordinate our activities in between events. The chat feature in Welo is perhaps not optimal, so Slack can serve as a slightly less ephemeral/confusing place to type things to each other during the event.

MailerLite

We just added this tool to our toolbelt while preparing for our most recent event. We'd wanted to email past participants to advertise upcoming editions, but struggled with the Gmail limit of sending 50 emails at a time. We didn't want to be marked as spam. We also wanted to give past participants a straightforward way to unsubscribe from being bothered by us ever again, for both quality-of-life and GDPR reasons. Once our free trial with MailerLite ends, we'll pay a small monthly fee to maintain the email lists we'd previously cobbled together in a variety of Google spreadsheets.


Our lean coffee is a very small version of the big unconference day. We split up into groups small enough to foster a conversation, cut out the whole marketplace aspect, and just vote on items in the topic queue to drive the discussion. The Miro board's a lot simpler, but the rest of the tools are the same.

Thanks to my fellow organizers Huib Schoots, Sanne Visser, and especially Joep Schuurkes since he suggested I overcome my recent writer's block by writing about this topic. Reach out to one of us if you're interested in setting up your own unconference, lean coffee, or other type of structured-yet-unstructured event.

Recapping My Year for a Performance Review

We've got annual performance reviews where I work. That shouldn't be the only time I receive feedback about how I'm doing. But the company budgets are set once a year, and thus the consequential "am I getting a raise or a promotion?" conversation typically happens once a year.

Due to a power vacuum in my department, my boss is also responsible for an R&D department of hundreds of people. He isn't focused on my day-to-day, and certainly doesn't remember what I accomplished a year ago. I barely do.

To prepare for my performance review, I wanted to present him with a clear picture of where I've been focusing my efforts. I have three places where I could see what I'd done, but only two were reviewable at a glance.

My focus board

With my amorphous floating-around-the-department Quality Lead role, I ask myself "what's the best thing to focus on?" a few times a week. I keep a Trello board of possible topics. It helps me remember what I wanted to start, pulls me back to what's important when I've been interrupted, and ensures that I communicate back to whoever's affected when I finish something. It also keeps me honest: it prevents me from having too much in-progress work at the same time.

The Trello board has five columns, from left to right:

  • Done in {this month}
  • In Progress
  • This Week {with the dates, including whether I've got any days off that week}
  • Next Week
  • Backlog

Tasks have a title and a tag (either a team or theme), broken down small enough that I can complete them within a few days. "Organize the conference" would be too big, but "draft Outlook invitation, identify and email all departments" with a conference tag would be small enough.

My goal is to keep the In Progress column down to one. Most times there's one thing in there I'm waiting to hear back on and one thing I can actively work on. At the end of the month, I take the whole Done column and move it to a completely separate Done Trello board. This way I can keep the information around without having to look at it all the time.

It was my Done Trello board I reviewed for my performance review. At a glance, I could see that much of my work focused on a shared testing repository, a side project, and helping out three particular teams.

My calendar

My calendar also gave me an overview of my effort for the year. Leadership book club, 1-on-1 coaching conversations, and knowledge-sharing sessions took small, incremental work every week or two. The work was typically too small to put on my Trello board, but still visible from the meeting titles as I paged through the weekly view of my calendar.

My notebook

Jerry Weinberg's Becoming A Technical Leader got me in the habit of journaling for a few minutes at the end of my workday. I hadn't spent time to summarize, group, or even re-read these journal entries along the way. I could have spent a lot of time reading through all my journal entries, but doing so wouldn't add much to what ended up being the ~7 minutes I had to recap my year to my boss.


In the end, nothing my boss said in my performance review was a surprise to me, which is just as it should be. I was able to remind him about some of the harder-to-see code review and 1-on-1 coaching. My boss was also able to bring to light something I couldn't or didn't see: my holistic way of thinking about our products, our department, and our company had influenced the way other people were thinking. People weren't just staying in their lane, performing their prescribed duties; they were thinking more about what all would be required to solve a problem, and how they could help.

How do you compile a summary of your year at work? Do you collect everything at performance review time, keep a brag document of all your accomplishments as you go along, or go next-level and update your resume every time you've succeeded? How do you capture the things that you invested a lot in that didn't go as anticipated?

Test Automation Guidelines

I received a large merge request recently, with thousands of lines of code changed and a handful of bullet points explaining some of how the test automation framework had changed. "It's already been reviewed from a technical perspective," the author of the code said. "I'd like you to review it from a test automation perspective."

I'd spent a few hours dipping into the code and deciding how to approach this review. This "test automation perspective" charter came from the conversation I decided to start with the author, before leaving any written comments on the merge request. I was looking to focus my efforts in a fruitful, productive direction, where my suggestions would likely be heeded. But what were the test automation principles I should be evaluating against? What did I expect from the setup of a test automation framework? Which criteria was I using to evaluate this merge request in front of me?

The Automation in Testing TRIMS heuristic came first to my mind:

  • Targeted
  • Reliable
  • Informative
  • Maintainable
  • Speedy

But there were other things I noticed while reading the tests that made me question my assumptions. I realized I needed to write those assumptions down. I wanted to come to a common understanding with the other testers in my department, rather than making decisions case-by-case, line-by-line with this author.

And thus, the Test Automation Guidelines for my department were born. Or rather, compiled from years of working on test automation code and hearing greater automators than I am write and speak about the topic.


Test Automation Guidelines

Entire repository

  • Test automation code must be version-controlled and linted.
  • Each function or method should do one thing and one thing only. Group functions and methods that do similar things together. “Separation of concerns” is usually how this is described. (See the sketch after this list.)
  • Code comments shouldn't duplicate information the code can provide. They should describe why the code is the way it is, and be used sparingly.
  • The README should explain how someone new to the repository can get set up to run the tests, and should include information about code style and contribution guidelines.
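
To make a couple of these guidelines concrete, here is a minimal sketch, assuming a Python test repository that talks to an HTTP API. The module name, endpoint, and auth flow are invented for the example; they are not from our repository.

    # login_helpers.py (hypothetical module)
    # Grouping everything related to authentication in one place keeps each
    # test focused on its own scenario ("separation of concerns").
    import requests

    AUTH_URL = "https://example.test/auth/token"  # made-up endpoint

    def get_api_token(username, password):
        # Why this helper exists: every suite needs a token, and the auth flow
        # changes more often than the tests that use it. (The comment explains
        # why the code is the way it is, not what each line does.)
        response = requests.post(
            AUTH_URL,
            data={"username": username, "password": password},
            timeout=5,
        )
        response.raise_for_status()
        return response.json()["access_token"]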

Individual automated tests

To automate a test or not to automate a test
  • Tests should be automated to the extent that the effort in writing and maintaining them is less (or less frustrating) than testing the same thing through exploration.
  • Automated tests contain an assert or verify. Assertions are better when they check something unique (an id, a name, etc.); see the sketch after these guidelines.
  • If you're using automation to expedite exploratory testing rather than to decide something on its own, make that clear.
  • Each test should test one thing and one thing only.
Readability and ownership
  • Tests should be readable. The best way to make sure you are not the only person who can read them is to pair to write them. The next best way is through code review. Smaller branches are easier to review and merge than bigger ones.
  • Automated tests are owned by the whole team.
  • Automated test output should be readable. You should be able to tell from the output what the test was trying to do, and how far it got.
Determinism
  • Don’t trust an automated test you haven’t seen fail. If you can’t change something about the test to make it fail, maybe it’s not testing what you think it’s testing.
  • Automated tests should provide information we care about. A big pile of green tests only helps us if they’re testing something relevant.
  • A failing (or even worse, sometimes failing) automated test should be a problem. Invest the time to make it deterministic, or delete it. Run the tests and publish the results so failing tests matter.
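
Here is a minimal pytest-style sketch of a few of the guidelines above: one behavior per test, an assertion on something unique, and a failure message that says what went wrong. The service URL and the create_user helper are hypothetical stand-ins, not part of any framework mentioned here.

    import uuid
    import requests

    BASE_URL = "https://example.test/api"  # hypothetical service under test

    def create_user(name):
        # Helper kept separate so the test below does one thing only:
        # check that newly created users get distinct ids.
        response = requests.post(f"{BASE_URL}/users", json={"name": name}, timeout=5)
        response.raise_for_status()
        return response.json()

    def test_created_users_get_unique_ids():
        first = create_user(f"ada-{uuid.uuid4()}")
        second = create_user(f"grace-{uuid.uuid4()}")
        # Assert on something unique (the ids), with a message that explains
        # the failure instead of a bare "assert False".
        assert first["id"] != second["id"], (
            f"expected distinct ids, got {first['id']} and {second['id']}"
        )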

Resources

  1. João Proença’s talk: “Should we just... delete it?!”
  2. Joep Schuurkes’s blog post: Test automation - five questions leading to five heuristics
  3. Joep Schuurkes’s blog post: How this tester writes code

Another department at my company had collected the TRIMS heuristic and a few other pointers (automate at the lowest possible level, use static analysis tools, etc.) that I linked my colleagues to rather than rewriting. Outfitted with my guidelines and theirs, I was able to go from a conversation-as-code-review deeper into the line-by-line-is-this-what-we-want-here code review.

I encouraged the author to identify and write down their own approach to the code after our conversation. They had strong opinions about what should go in a page object, when it made sense to break a test into more than one, and how the output should read when a test failed. By writing those preferences down, I could evaluate whether they were being applied consistently. Everybody needs an editor.


Do you have guidelines you refer to when reviewing test automation code? Beyond the links I provided above, is there some other reference you'd point me to? When do you find yourself bending or questioning the guidelines you thought you held dear?

Belgian Exploratory Workshop on Testing 2018

I'm going through some old conference notes again, with the aim to eventually get rid of them all. Today's edition comes to you from the Belgian Exploratory Workshop on Testing in December of 2018. Being invited to one of the most charming cities in the world to talk shop with experts from around Europe was...pretty much my dream scenario when I'd moved to this continent a few months earlier in 2018.

Coaching mindset and role definition are common themes throughout the presentations. They're also top-of-mind for me now, during performance review season, and as I shift from filling in for testers missing around the department back to thinking more holistically about our teams and products again.

Vera Baum

Vera spoke about the learning process, specifically helping coach testers on their learning journey. Learning should not be an incentive or a punishment. The advantage of not being a manager was that people could set learning goals without fearing they weren't focusing enough on their day-to-day work. Learning goals were set more clearly for people earlier in their careers, since they need more of a scaffold.

Aleksandra Korencka

Aleksandra spoke about how to be a leader without a leadership title. Even making a checklist of simple things to think about for every release helped her colleagues in their regression testing. For Aleksandra:

  • seniority = freedom + responsibility + impact
  • experience = (people x technical) intuition

She went through the process of creating a testing manifesto: an inspirational vision for her team. The process of creating the manifesto proved to be more valuable than the written document itself.

Shanteel (I apologize for not writing down your last name)

Shanteel was in a spot where their developers were undervaluing testing, because everyone sees other people's jobs as easier than their own. To shift that mindset, the group discussion pointed them towards building relationships with a few allies who could help cause a revolt when the time was right.

Marcel Gehlen

Marcel found that he had more influence over testing as a manager than he did as a tester. The people in his department could test what they thought a customer needed instead of just the software. Testers did stuff that "shouldn't be done"; they "cheated". Plus they got more visibility when they had an advocate higher up in the org chart.


I also gave an experience report. It was about a certain project manager from a previous company who was so distracting and forgetful that we had to work around him. I scheduled shadow meetings that we hid from the project manager so my developers and I could make real progress. The project manager's name became the go-to insult for the rest of the conference. :)

Shoutout to Beren Van Daele for organizing BREWT in the coziest conference location. I could have spent the whole week in that library/lounge. I am always accepting good (or even bad!) reasons to go back to Ghent and have some decent stoverij.

Building Skills Over Time

Our engineering teams have developers and testers on them. For our other roles, we've got specialists shared across teams: product owners, UX designers, UX researchers, data analysts, technical writers, etc. Specialists usually attend the weekly refinement sessions and sprint reviews, and skip the other sprint ceremonies.

One of our engineering teams is adding a Kafka event-driven architecture to our low-code platform. One of the specialists serving the team was struggling to understand the atomic particle of this architecture: business events. (Here's my favorite concise introduction to the topic.) They kept thinking in terms of a user interface, while the team described a back-end system.

I saw the specialist struggling in meetings as patient subject matter experts used their lingo to explain it all. The specialist still seemed lost. I met with them individually and realized why they were stuck in the UI: they didn't know how APIs worked.

I don't know how APIs work.

All the explanations from our subject matter experts had jumped from "a user does this" to "so an API does that", but this specialist didn't have a good grasp of what an API was. Any explanation that started with "you don't want the API to be tightly-coupled" did not sink in for them. Explaining it more times wasn't getting them there. We needed to start from the beginning.

I say "we", because as Quality Lead for the department, I see my role as making everyone else's work smooth and shiny. I also suspected this specialist wasn't the only one struggling with the topic.

Let's learn together.

I started scheduling optional 45-minute meetings every few weeks. I didn't have the energy to add another weekly or bi-weekly recurring meeting to my slew of calendar invitations. Ad-hoc, one-at-a-time sessions made this endeavor manageable and opt-out-at-any-time-able.

For topics, I saw that there were a few steps I knew we should cover to get everyone up-to-speed on business events (and how they're talked about in the context of our product and company):

  1. What is a REST API?
  2. What is an OData API?
  3. When would you choose REST over OData or vice versa?
  4. What is a business event? (See the sketch after this list.)
  5. When would you choose business events over/combine them with REST or OData?
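
For anyone who, like that specialist, wants a concrete picture of the difference between topics 1 and 4, here is a minimal sketch. It is not our product: the URL, topic name, and payloads are invented for illustration, and it assumes the kafka-python client. The first half pulls the current state from a REST API on demand; the second half reacts to business events that announce a change.

    import json
    import requests
    from kafka import KafkaConsumer  # kafka-python client (an assumption for this sketch)

    # REST: the caller asks for the current state whenever it wants it.
    response = requests.get("https://example.test/api/orders/42", timeout=5)
    print("Current status:", response.json()["status"])

    # Business events: a producer announces that something happened, and any
    # interested consumer reacts, without anyone asking for the data.
    consumer = KafkaConsumer(
        "order-status-changed",                # hypothetical topic name
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    for event in consumer:
        print("Order", event.value["orderId"], "is now", event.value["status"])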

I kept all the details in the same Dropbox Paper document:

  • the ultimate goal (to understand when to use business events)
  • upcoming potential topics (starting with the list above, which grew as bigger questions came up during the sessions)
  • the date and specific topic for the next session (we'd decide at the end of each session what the next most important step was)
  • the recording of the Zoom call from the session
  • the notes I took during the session

Yes, I took notes in addition to recording the sessions. Every session, I'd share my screen and take notes. My intention was specifically to help others learn, so recording the sessions and taking notes (that they could help clarify and correct in real-time) freed them from the cognitive load of both learning and remembering in the moment.

For the earlier sessions when I was explaining the topic, taking notes helped slow me down enough for the information to sink in for people who were hearing it for the first time. The notes provided an obvious place to collect links with examples of API specs or tools (like Postman and the OpenAPI web editor).

For the later sessions when the subject matter experts were explaining the topic, the notes helped me make sure I'd understood what was said and capture the answers to questions from the audience.

The notes also served another purpose later: it helped people decide if they needed to watch the recording. The extended team I invited to these sessions was seven people, and later two subject matter experts, so not everyone could make a 45-minute meeting fit in their schedule. People who can't take 45 minutes to learn something crucial about their work really don't need to spend 45 minutes watching a video to find out if they care about a topic. Glancing through the notes helped them decide if they wanted to hear the whole conversation.

Impact one year later

In the first few months, some people were learning and some people were listening to information they already had a good grasp of. It took a few months for everyone to be learning. That's when these sessions really started to pay off. Even if I felt like they'd been rambling, obvious, or useless, one of the seven participants would reach out to me to confirm that the session and the notes helped them. That feedback kept me going.

At the retro the Kafka team held looking back at 2022, there was a stickie about this specialist. It said that collaboration between them and the team had improved. I had to agree. They shared a common vocabulary. The specialist understood the concepts the team was dealing with. They could conceptualize beyond the UI to the back-end. The team goes to the specialist for little, quick questions in a way they wouldn't have before, because every conversation used to feel like starting from the beginning. Now, they hit the ground running, together.

I also want to give credit to this specialist: their remarkable improvement reflects the overall time and effort they've put into deepening their knowledge on business events. The sessions I organized were only part of their journey.

Takeaways

I think most of our specialists had enough of an idea about REST APIs when we started these learning sessions a year ago. But nobody knew as much about REST, OData, business events, and how or why to combine them as they do now.

I started with the belief that if one specialist didn't know everything, likely other people were in the same position. I had a growth mindset for my colleagues: given the right environment and the right pace, all of these specialists could absorb this information. I also had a growth mindset for myself: given enough time, I could make a difference. It was worth the investment.

Photo by Suzanne D. Williams on Unsplash