Running PyCon ZA 2020 with Big Blue Button and Discord
We settled on Discord as the general chat / social platform fairly early. The ease of creating channels, the various integrations (such as for registration) and the good-quality built-in voice options make for an attractive package, and the cost was also right.
Also, a fair percentage of our attendees already had exposure to Discord, so it was one less new thing to throw at people.
There was some concern about the fact that it isn't open source, but ultimately the convenience and simplicity factors won out here. It was also one less thing we'd have to manage infrastructure for.
Pre-conference testing and setup
We initially set up Big Blue Button on a test server that one of the conference organisers had available, and used it for a couple of meetup-style events and meetings. We did try to stress the server to some extent, but the testing was mostly about whether we could manage a reasonable workflow for running talks with BBB.
Some time before the conference, we rented three significantly more powerful servers as the conference machines. The idea was to have a separate machine for each virtual conference room, and a spare that we could swap in easily if something went wrong. Initially, we also planned to offload processing of the video recordings to a fourth machine, but we abandoned that because of the cost and the time needed to set up the additional recording management. Instead, we set up BBB to do minimal processing of the recordings, so that we could run the full processing pipeline after the conference.
We automated setting up the machines using Ansible (the details are in the PSS-SA GitLab repo) and used the machines for several meetings. We did a fair amount of stress testing with robot clients to make sure we could carry the expected load.
Experiences during the conference
During the first keynote, we ran into significant network traffic issues. This was a surprise given the earlier stress testing, but probably inevitable, since the simulated clients could not match the behaviour of real people, especially the spikes from people repeatedly trying to reconnect. We were able to apply several fixes during the conference, which helped considerably, but it did add a fair amount of stress for the organisers.
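As a rough illustration of why evenly paced robot clients can understate the load from real attendees, here is a toy model. It is not the tooling we used for the stress tests; the function name, the capacity figure and the arrival patterns are all made up for the example. It counts connection attempts per second against a server that can only accept a fixed number of joins each second, with rejected clients optionally retrying.

```python
from collections import Counter


def attempts_per_second(arrivals, capacity, retry_after=None, horizon=300):
    """Count connection attempts per second against a toy server.

    arrivals: the second at which each client first tries to join.
    capacity: joins the (hypothetical) server can accept per second.
    retry_after: seconds a rejected client waits before retrying,
        or None if rejected clients simply give up.
    """
    queue = Counter(arrivals)  # second -> number of clients attempting then
    attempts = []
    for second in range(horizon):
        trying = queue.pop(second, 0)
        attempts.append(trying)
        rejected = max(0, trying - capacity)
        if rejected and retry_after is not None:
            queue[second + retry_after] += rejected
    return attempts


# 600 robot clients joining evenly over a minute, never over capacity.
robots = attempts_per_second([i % 60 for i in range(600)], capacity=50)

# The same 600 clients arriving within 5 seconds and retrying every second.
humans = attempts_per_second(
    [i % 5 for i in range(600)], capacity=50, retry_after=1
)

print(max(robots), sum(robots))
print(max(humans), sum(humans))
```

With these invented numbers, the evenly spread robots peak at 10 attempts per second, while the same number of impatient clients peaks at 400 attempts per second and generates 2700 attempts in total, because every rejected join comes back a second later on top of the new arrivals.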
In general, Big Blue Button worked as expected. Most speakers were able to use the system without problems, and, other than the opening keynote, we never saw significant network issues on our servers.
One problem we did not adequately anticipate, and had no good workaround in place for, was when the speaker had internet issues. There were a few cases of local networking problems during the conference, and this led to problems in a couple of talks. I'm not sure there is a good way of addressing this on the fly while still having live talks, as there are so many factors that can cause problems on a given day.
Another downside was that discussion on the talks ended up split between the Big Blue Button chat and the Discord server. It would have been better to integrate the two somehow, so conversations were all in one place.
After the conference
After the conference, we spent a fair bit of time discussing how to process the video. There were concerns about the names in the recorded chat, but we also felt we needed to keep the chat on the published recordings, as speakers sometimes addressed points raised in the chat as the talk went on. We ultimately settled on a fairly crude process to anonymise the chat, and ran with that.
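The actual script isn't reproduced here. Purely as an illustration, a crude anonymiser along these lines could look like the sketch below. The `(sender, text)` input format, the `anonymise_chat` name and the `keep` parameter are assumptions for the example, not Big Blue Button's recording format or the process we ran.

```python
import re


def anonymise_chat(messages, keep=()):
    """Replace sender names with stable pseudonyms.

    messages: list of (sender, text) pairs (a hypothetical format).
    keep: names to leave untouched, for example the speaker.
    """
    # Assign "Attendee N" in order of first appearance.
    pseudonyms = {}
    for sender, _ in messages:
        if sender not in keep and sender not in pseudonyms:
            pseudonyms[sender] = f"Attendee {len(pseudonyms) + 1}"

    def scrub(text):
        # Also replace names that are mentioned inside message bodies.
        for name, alias in pseudonyms.items():
            pattern = rf"\b{re.escape(name)}\b"
            text = re.sub(pattern, alias, text, flags=re.IGNORECASE)
        return text

    return [
        (pseudonyms.get(sender, sender), scrub(text))
        for sender, text in messages
    ]


chat = [
    ("Alice", "Is this recorded?"),
    ("Speaker", "Yes, Alice, it is."),
    ("Bob", "Thanks alice"),
]
for sender, text in anonymise_chat(chat, keep=("Speaker",)):
    print(f"{sender}: {text}")
```

Keeping the pseudonyms stable within a talk means the conversation can still be followed, and a speaker's replies to questions in the chat still make sense. A simple word match like this will miss nicknames and misspellings, which is part of what makes any such approach crude.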
While processing the video, we also hit issue 10223, which resulted in some bizarre error messages and failures. Fortunately, the fix turned out to be simple, and we were able both to process our videos and to provide the information needed to fix the bug.
We did have to manually edit some of the videos. Notably, because of how Big Blue Button handles things, external videos played during a talk aren't included in the recording, so we had to patch those in manually. Fortunately, there were only a couple of cases where this was required.
Notes and takeaways
There are no major new insights, really, just repeats of lessons people have learnt in the past.
- Realistic stress testing of network services is hard.
- Having fast response to problems helps.
- BBB is quite good, although probably not a good fit for a larger conference.