Making the First Hybrid CHI in 2022

Published in ACM SIGCHI, Sep 17, 2022

In the months since CHI 2022, the comment I’ve heard echoed most about hybrid conferences is “Running a hybrid conference is essentially like running two conferences.” The error here is that if you design two conferences, you do not have one hybrid conference. For CHI 2022, our goal was to have one blended conference. As researchers in the field, it is easy to have the “what does hybrid mean” discussion and to retell accounts of that great hybrid conference you went to with fewer than 57 people. None of this matters at scale; assembling a handful of Zoom rooms and a Slack channel doesn’t work when you have hundreds of authors and thousands of attendees. A few people have asked us how we built a hybrid conference, what worked, what failed, and what should be done differently; this article aims to detail all of that and does step a bit into the technical weeds.

First off, we were thinking about boots on the ground when planning. Here we focused on a few core deliverables:

An architecture that utilizes SIGCHI infrastructure and that is reusable.
Video pixels need to be streamed out clean and clear.
People should have one place to go to consume content (remotely or onsite).
Blended interactions are essential.

And while we want to do all we can, we must stick to delivering the minimum viable product (MVP) first when designing the experience. I’ll speak mostly to the design of these systems and leave out the gritty bits of the vendor discussions, but I will include some of that where needed.

Architecture

Having spent some years as SIGCHI’s VP of Operations, I was familiar with many of the systems in place. Stitching them together was possible, but then our SIGCHI Programs App would just have outbound links to the video stream, the Slido for Q&A, the Discord room for chat, the paper in the DL, and wherever else needed. In hindsight, I wish we had done this, but we wanted to minimize the application windows and logins (trying to only use our registration system) to the best of our ability. Embeds would have solved the task at hand, but the Programs App was not able to host them yet. As an alternative, we picked a vendor (HUBB) which could host the page with the embeds and promised to deliver a WCAG 2.1 AA accessible experience while doing so. Lastly, this all had to be linked with presenters (remote and local), AV staff, and attendees (remote and local). In the end, we had the following flow:

Figure 1: CHI 2022 Hybrid Architecture. A workflow that goes from left to right, from local/remote presenters to a mixing board, which forks to an in-room projector and a YouTube stream embedded in a Hubb page. The viewer entry point is the PWA on the right of the diagram.

Here we see the audience can enter the experience using the SIGCHI Programs App (PWA) and navigate to Discord (for chatting with local and remote people), to the Slido for questions, or to the livestream if they aren’t in the room. People presenting are all funneled through Zoom (no more plugging in your laptop), which is projected to the room and streamed out via RTMP. If someone couldn’t make it (maybe they got sick or had technical issues), we had the pre-recorded video presentation as a backup. These pre-recorded talk videos are also persisted and stored with the program (and are often reshared and viewed after the conference). This architecture was optimized for streaming pixels, reducing platforms, and hosting blended interactions. While some of this went great, not all of it went to plan.
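To make the flow concrete, here is a minimal sketch of the source selection described above: a presenter on Zoom is preferred, the pre-recorded video is the backup, and whatever is selected goes to both the room and the stream. The names `Talk`, `select_source`, `outputs_for`, and the "holding-slide" fallback are assumptions made for illustration; at CHI 2022 this switching was done by people at a mixing board, not by code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Talk:
    title: str
    presenter_on_zoom: bool            # presenter (local or remote) has joined the Zoom call
    prerecorded_video: Optional[str]   # backup video location, if one was uploaded


def select_source(talk: Talk) -> str:
    """Pick what the mixing board should put on the program feed."""
    if talk.presenter_on_zoom:
        return "zoom"
    if talk.prerecorded_video is not None:
        return "prerecorded"
    # Hypothetical last resort, not described in the article.
    return "holding-slide"


def outputs_for(source: str) -> List[str]:
    """The selected source forks to the room and to the livestream."""
    return [f"{source} -> room projector", f"{source} -> livestream"]
```

The point of the sketch is that the room and the remote audience always receive the same feed, which is what keeps the event one conference rather than two.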

Streaming Pixels

In many ways, this is the obvious core to any hybrid event. If remote people can’t consume the content in real time, then it’s just an asynchronous viewing party. While live streams are not uncommon on the Internet, in our use case they require a fair bit of infrastructure. First, we need a high quality feed so the slides are viewable. Second, we need that feed delivered with closed captions. Third, if someone is remote, they should be able to ask a question into the room. Finally, it would be nice to see the speaker, the audience, and the room. With regard to the MVP, that last point can be omitted if it has to be.

As we learned, the core of this falls into the hands of the onsite AV vendor. You know these people; they are the ones who tape down microphone cables and set up booms at conferences you probably attended before the pandemic. They also set up cameras and mixing boards and manage streams. The connection to the Internet (and bandwidth therein) usually is under the jurisdiction of the convention center. AV and Internet account for the bulk of the added cost for hybrid; given that the SIGCHI EC did not allow for an increase in registration for 2022 (and CHI registration has increased in total by 40 USD in the last decade with no increase in the past 6 years), it is arduous to cut costs while adding hybrid experiences. Going hybrid adds hundreds of thousands of dollars for each needed piece: AV and Internet, the virtual platform, and added project management costs. If you’re curious, official numbers will be posted over on SIGCHI’s site when finalized; historical budgets are there already.

Referring to our architecture in Figure 1 above, if we have 4–5 speakers in a session, two of them remote, we could set up a very complicated mixing board that reads the HDMI feed from the slides in high def and spots a camera on the speaker, with one or two people working the mixing board and the camera. Remote presenters will come in via a video conference tool anyhow (we used Zoom). So to simplify the rig, we had all presenters (local or remote) present through Zoom, and that direct line was on the mixing board and broadcast to the room. The shuffle to change HDMI/DVI/VGA drops is replaced with presenting in Zoom. Despite our noting the change in pre-conference correspondence, presenters gave significant pushback, asking why we couldn’t change it back to the old way of just using a cable. A wide view stage camera takes the place of a camera operator and can capture some of the room and speakers. What’s happening is that we are simplifying the configuration and human power needed to run this event (and saving costs, as having a full crew on site is prohibitively expensive).

Unfortunately, staffing issues during the pandemic made it difficult to find trained AV experts. Being sandwiched by New Orleans Jazz Fest also didn’t help, as the experts available were working (presumably) higher paying gigs. Many of the issues we faced on day one happened right at that mixing board: double audio streams, switching inputs between pre-recorded video and Zoom, etc. In one session, I walked over to the mixing board to drop an audio level. We can’t blame the operators here; it’s a byproduct of recovering from a pandemic. Still, it was an issue we faced.

One Location for Consumption

The last thing we wanted was to hand out 3–5 URLs for each session for people to use for watching, chatting, asking questions, seeing who’s speaking, finding the paper in the DL, supplemental video and whatever else. The SIGCHI Programs App does a good job here for detailing the content and hosts several links under each session or paper. We wanted just a single landing for all of these interactions embedded so people didn’t need to go between platforms. The SIGCHI Programs App didn’t have the ability to handle the embeds at this point but that is and should be the solution for all future SIGCHI events. I’m working with the SIGCHI VP of Operations to make this path happen.

Now there’s a whole industry for this, where we see Hubb (which we used this year), Delegate Connect (from last year), and Midspace (formerly Clowdr, used at other SIGCHI events). I’ll personally go on record saying none of these meet the standards we set for CHI and SIGCHI. First of all, we have a superior program app. Second, none of them (really none of them) adhere to our accessibility standards (despite signing contracts saying they comply). Third, there’s a fair bit of effort duplication to stand up these sites once we have the SIGCHI Programs App in place, as the program has to be replicated elsewhere. It’s a mess.

Unfortunately, the path was turbulent from the moment the vendor contract was signed. Much work had to be duplicated by the volunteers (migrating from the Programs App to HUBB). HUBB created more logins for people. HUBB’s chat (for Q&A) was not accessible to our standards, nor was it patchable, so we had to (last minute) use Slido. HUBB’s navigation was overall poor compared to the SIGCHI Programs App. The list goes on. The “let’s have one platform” idea did not work.

But something more was happening here beyond one vendor not meeting our quality specifications. I realized by watching attendees that we don’t live in a one platform world. Stop right now and count how many tabs are open in your browsers (yes, plural). We are used to juggling platforms. How many times does someone check Instagram or Twitter on their laptop versus their phone? Some people prefer to watch a video on their iPad and type questions into Slido on their phone. SIGCHI’s IMX conference grew out of research on second screen experiences. The community dissected the Hubb platform and started sharing Slido room numbers and the direct YouTube live stream links. Many of our attendees were sharing the Slido rooms with others on Discord (so much so that we wrote a script to broadcast the Slido room numbers there). Others were sharing the direct YouTube stream links openly, as the HUBB embed was prone to failure. This was fine because we had made the decision to just open up all the live streams to everyone (registered or not) so they were easily viewed. I know I was asking Slido questions from the Slido app on my phone with Discord up on my iPad. It’s the Zero, One, Infinity rule in effect. One platform won’t work, and people will want to consume content in native environments or just pick the one piece they want. Embrace the platforms and provide access to the atomic pieces.
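The broadcast script mentioned above is not reproduced in this article, but a minimal sketch of the idea might look like the following, assuming a Discord incoming webhook (which accepts a JSON body with a `content` field). The function names, the session names, the room codes, and the User-Agent string are all made up for illustration; this is not the script the Discord team actually ran.

```python
import json
import urllib.request
from typing import Dict


def build_announcement(sessions: Dict[str, str]) -> dict:
    """Build a webhook payload listing one Slido code per session name."""
    lines = [f"{name}: Slido #{code}" for name, code in sorted(sessions.items())]
    return {"content": "Slido rooms for the current sessions:\n" + "\n".join(lines)}


def post_to_discord(webhook_url: str, payload: dict) -> int:
    """POST the payload to a Discord webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # An explicit User-Agent; the default urllib one may be rejected.
            "User-Agent": "slido-broadcast-sketch/0.1",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Running `build_announcement` on a schedule (for example, at the start of each session block) and passing the result to `post_to_discord` would give attendees the room numbers in the place they were already looking for them.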

Blended Interactions Are Essential

Being hybrid is not about segregating the online crowd into their own channel to discuss amongst themselves. Hybrid is about blending interactions. If you take live questions from the audience and have a volunteer in the room scanning an online chat room to find and surface remote questions, the remote attendees will basically be ignored. Here, running all questions through Slido allowed everyone to put in a question and allowed the session chair to have a single feed to scan for questions to relay to the speaker. Now we will miss the occasional person who would take the mic and launch into a soliloquy on “why you didn’t cite my past work?” but personally I’m ok with that. What does get lost is people finding out who other people are. We could start a practice of people listing who they are in the question: “Why didn’t you cite my past work? (ayman • CHI22 TPC)” to help. You could probably try to enforce a protocol that alternates one live question with one remote question, but as any CHI organizer will tell you, “enforcing anything at CHI is next to impossible”. Also, there’s no promise that the presenter will be in the room anyhow, since we now allow remote presenters.
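As a rough illustration of the single feed idea and the attribution practice suggested above, the sketch below puts in-room and remote questions into one list ordered by submission time and appends who is asking. The `Question` fields and function names are hypothetical, and Slido itself decides how its real feed is ordered; this only shows that the chair never needs to treat the two audiences differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Question:
    text: str
    asker: str
    affiliation: str
    origin: str          # "room" or "remote"; never used for ordering
    submitted_at: float  # seconds since the session started


def format_question(q: Question) -> str:
    """Append the asker's identity in the style suggested in the text."""
    return f"{q.text} ({q.asker} • {q.affiliation})"


def chair_feed(questions: List[Question]) -> List[str]:
    """One feed for the chair: oldest first, regardless of where it came from."""
    ordered = sorted(questions, key=lambda q: q.submitted_at)
    return [format_question(q) for q in ordered]
```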

Remote session chairs just didn’t work. Initially we didn’t want this feature, but an equity argument was made and we didn’t have enough session chair volunteers. Having remote session chairs didn’t work because the first job of the chair is not to meet speakers and ask questions if there’s dead air after a talk. The first job of the session chair is to control the room (take the conn, nautically speaking). This is impossible to do when remote. After day one of CHI we worked closely with our student volunteers and retrained the AV staff, so things went smoother, but it still wasn’t perfect. I think remote chairs could work if we brief the SVs for those sessions in advance, advising them how to take the conn. There is the cost of the added burden on the SVs to take on this role, but in situ at CHI 2022, many SVs did so to save the day. Thanks SVs, you all are amazing!

Some venues just will need bifurcating: most notably Interactivity. Watching a livestream of other people using a demo is less compelling than just watching a polished video of the demo and chatting with the authors and builders. Once upon a time there was a CHI video theater where we’d have popcorn and watch video demos. Perhaps bringing that back in some community chat space for live discussion might be the solution. Which brings me to perhaps one of my favorite things about CHI 2022: Discord.

Initially, I’ll admit, I didn’t want a Discord server (as I was focused on fighting platform creep). When, weeks before launch, it turned out the chat rooms in Hubb were not WCAG 2.1 AA compliant, we added Discord (led by Monica Pereira, Danilo Gasques, and Kashyap Todi, who did amazing work setting it up, vetting attendees, and writing bots). Discord became the mechanism where people on site and people who were remote connected, chatted, complained, praised, and troubleshot. It was kinda cool to see how quickly the community bloomed there. Pretty much the few voices I heard complain about Discord onsite just rejected using it or logging in.

Now at this point, you might say ‘Why not just let anyone ask a question and put all the remote attendees in a livestream room and nothing more?’ This seems simple but ultimately won’t work. First, to interact with the sessions (even if just asking a question) one would have to be registered for the conference (even if cheap or free) as that puts the person under our code of conduct. Second, the livestream platform would have to only allow those registered people in which would create another point of login or new logins to create for people (sound familiar?). Lastly and most importantly, this effectively walls off the remote viewers away from the local in-person attendees. The goal should be to blend spaces and people, not segregate them. There is the issue of the attendee who is onsite and “can’t be bothered with Discord” to which I’d just say the world has changed and it’s their loss; they get the lesser experience by not joining the discussion.

What About Future CHIs?

The good news here is the SIGCHI Programs App is now equipped to handle embeds too, so future conferences can offer a mostly “one login” experience as well as provide the outbound links to the individual platform tools. Many of the pipelines in our architecture offer reusable and easily scriptable datastreams. I’m working with the current SIGCHI VP of Operations (Hi Kash!) to build out a flexible path here. Not pictured in the architecture above is the SIGCHI QOALA app, which links to PCS (our paper review system) and allows conference chairs to design the sessions and lay out the conference. If QOALA creates the embeds, live streams, and Q&A rooms programmatically, then hundreds of hours of volunteer time can be saved. For example, imagine a talk is set for Monday morning but during planning that talk is moved to Wednesday afternoon. All the links and rooms and embeds around that talk have to be updated (and yes, we did this manually in 2022). If QOALA can stand up the embeds, it can keep track of schedule changes and update things automatically. This architecture would take a new simplified form as seen in Figure 2. This would require only two logins: an ACM login and a Discord login. We can ensure both will adhere to our privacy standards, and the Discord can be shut down after the conference.
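One way to picture the schedule-change example is to derive every per-session resource from a stable session identifier, so that moving a talk changes only what is displayed and never the links people have already shared. The sketch below is an assumption about how such a system could be modeled; it does not describe QOALA’s actual data model, and the `Session` fields, the `example.org` URL, and the naming scheme are invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Session:
    session_id: str  # stable identifier; does not change when the talk moves
    title: str
    slot: str        # e.g. "Mon AM"
    room: str


def build_resources(session: Session) -> dict:
    """Derive every per-session resource from the session record."""
    return {
        # Keyed by session_id only, so these survive any reschedule.
        "embed": f"https://example.org/embed/{session.session_id}",
        "qa_room": f"qa-{session.session_id}",
        "chat_channel": f"#{session.session_id}",
        # The human-readable listing is the only part tied to the schedule.
        "listing": f"{session.slot} | {session.room} | {session.title}",
    }


def reschedule(session: Session, new_slot: str, new_room: str) -> Session:
    """Return a copy of the session in its new slot and room."""
    return replace(session, slot=new_slot, room=new_room)
```

With this shape, moving a talk from Monday morning to Wednesday afternoon is one call to `reschedule` followed by regenerating the resources, rather than a manual pass over every link.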

In the early days of the pandemic, when I was SIGCHI VP of Operations, we worked hard to stand up Asynchronous Hybrid (with RecSys being the first on deck). Two years ago, when we started planning, Caroline Appert and I (the TPCs), Simone Barbosa and Cliff Lampe (the GCs), and the rest of the CHI 2022 committee believed the path forward for hybrid was synchronous. With the successes and obstacles we faced in 2022, I still believe synchronous hybrid is the future. More so, working with the EC on this solution allows all 24 SIGCHI conferences to follow the same mechanism and lets SIGCHI become a leading example for hybrid conferences with blended remote and local interactions.

Photo of a large conference room full of people looking at the screen — Photo by Claudio Schwarz on Unsplash