No flying pigs allowed
Whether it’s databases, AI, fake news, file systems or chamber music, computer scientist Margo Seltzer likes to connect the disparate — while keeping it grounded.
By Silvia Moreno-Garcia
You can almost see the ideas bounce around the room when you chat with Margo Seltzer. They zoom back and forth on interesting tangents. Recently, the UBC computer scientist learned that part of teaching chamber music at UBC means showing musicians how to work effectively as a group. Now Seltzer is wondering how to apply such teaching techniques to computer scientists, who also frequently work in small groups.
“We have this image of a hacker in a dark room eating Cheetos. However, the reality is that computer science is highly collaborative, and that collaboration propels both people and ideas,” she explains.
“Chamber musicians communicate without a leader. They’ve come up with ways to teach that. Meanwhile, in computer science we ask people to do group work but we don’t have a formal way to teach them how to do it. We could learn from them.”
That, Seltzer says, is her superpower: connecting things that are not obvious, like music and computer science. And she’s had a long history of doing just that.
Seltzer joined UBC as the Canada 150 Research Chair in Computer Systems and the Cheriton Family Chair in Computer Science last fall, after spending 26 years at Harvard. In 1997 she co-founded Sleepycat Software, a database software company, which was eventually sold to Oracle. She served as a faculty adviser for the Women in Computer Science group at Harvard and was a director of USENIX: The Advanced Computing Systems Association.
At UBC, Seltzer has launched the Computer Systems Laboratory, which is interested in issues of data quality, machine learning, and systems and storage.
“People say ‘It’s all about artificial intelligence, that’s what matters.’ But you need systems to run that AI. You can’t run an AI algorithm on the carpet,” she says with a smile.
This very much ties to Seltzer’s research philosophy. She likes to keep her feet on the ground. Her work is about problems found in reality, everyday questions which require solutions.
“When it comes to ‘If pigs could fly’ questions, I say ‘Show me the pigs,’” she says, though that doesn’t mean she’s not up to solving some audacious problems.
Digital needles in virtual haystacks
Seltzer has a long-time passion for making information easier to find, something which is becoming increasingly difficult. In the old days, people would place documents in filing cabinets and folders. There were a limited number of these cabinets. But now we have so much data on so many different devices that it can become impossible to find the thing we need. What was that file named? Is it in our inbox? Is it in Dropbox or Google Drive or some mysterious cloud server? Was it a PDF or a JPG?
The Internet, she says, is better at finding things than PC file systems because, since their inception, search engines have looked at relationships between objects and leveraged them. Our file systems don’t do that.
“Let’s say I send you an e-mail with an attachment and you save it. There’s a relationship between the e-mail and the attachment that the file system doesn’t necessarily track. There’s no way for you to look at the file and see where it came from,” Seltzer explains.
Seltzer’s hypothesis is that there are several types of relationships computers could leverage. First are the relationships that we’re already used to, like maintaining certain documents in certain folders. Second are relationships that a computer system observes but doesn’t capture; the e-mail message with the attachment is an example. Third are relationships we don’t know about and the system doesn’t track, at least for now. One example might be temporal relationships: You had 12 tabs open when you were writing a letter of recommendation. You found an interesting article in one of those tabs. What was the article? If you only remember that you found the information when you were typing a letter (but can’t recall what it was exactly), you may never stumble onto the article again.
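To make the temporal example concrete, here is a minimal sketch in Python of how a system might answer "what was open while I was writing that letter?" It is an illustration only, not a description of Seltzer's software: the `Activity` record, the `open_at_same_time` function and the sample log are all hypothetical, and it assumes the system has already recorded when each document or tab was open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One interval during which a document or tab was open (hypothetical record)."""
    name: str
    start: float  # seconds since an arbitrary reference time
    end: float


def open_at_same_time(target: Activity, log: list[Activity]) -> list[str]:
    """Return the names of logged activities whose interval overlaps the target's."""
    return [
        a.name
        for a in log
        if a is not target and a.start < target.end and target.start < a.end
    ]


# Made-up sample log: a letter, one tab open while it was written, one tab opened later.
log = [
    Activity("recommendation-letter.docx", 100, 400),
    Activity("tab: interesting article", 150, 200),
    Activity("tab: weather", 500, 600),
]
letter = log[0]
print(open_at_same_time(letter, log))
```

Only the article tab overlaps the letter-writing interval, so it is the one the lookup returns; the later tab is ignored.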
The good news, Seltzer says, is that finding things is something machines are relatively good at. A computer can look for a word in a document much faster than a human. But we still need to develop better ways of locating this information through a simple interface, and to do so in novel ways. Humans, after all, don’t necessarily think in terms of folders, nor do they necessarily organize their lives depending on what device they are holding in their hand at the moment.
A forgery test for the digital age
Seltzer has also recently become interested in another timely topic — identifying fake news. Seltzer’s team is developing a tool which would allow users to produce a ‘provenance graph’ showing the author of a story, where it was published, its sources, and how it relates to other articles. If her team is able to analyze a large number of news stories, researchers may be able to look at the structure of an individual article and determine whether it’s a fake news story. It’s something akin to pinpointing a forged work of art.
“We’re trying to develop some defensive strategies to help people navigate this brave new world. Our hypothesis is the structure of the graph for a fake news story will look quite different from stories from more legitimate sources,” she says.
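One way to picture a provenance graph is as a set of labelled links between stories, authors, outlets and sources. The Python sketch below is a toy illustration under stated assumptions, not the team's tool: the `ProvenanceGraph` class, the relation names (`written_by`, `published_in`, `cites`) and the three structural features are all invented here to show what "the structure of the graph" could mean in practice.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy provenance graph: labelled edges between stories, authors, outlets, sources."""

    def __init__(self):
        # (node, relation) -> set of neighbouring nodes
        self.edges = defaultdict(set)

    def add(self, src, relation, dst):
        self.edges[(src, relation)].add(dst)

    def related(self, node, relation):
        return set(self.edges.get((node, relation), set()))

    def citation_depth(self, story, _seen=frozenset()):
        """Length of the longest chain of 'cites' edges starting at story."""
        if story in _seen:  # guard against citation cycles
            return 0
        cited = self.related(story, "cites")
        if not cited:
            return 0
        seen = _seen | {story}
        return 1 + max(self.citation_depth(s, seen) for s in cited)

    def features(self, story):
        """A few hypothetical structural features of one story's graph."""
        return {
            "has_author": bool(self.related(story, "written_by")),
            "num_sources": len(self.related(story, "cites")),
            "citation_depth": self.citation_depth(story),
        }


# Made-up example data: one well-sourced story, one with no author or sources.
g = ProvenanceGraph()
g.add("story-A", "written_by", "author-1")
g.add("story-A", "published_in", "outlet-X")
g.add("story-A", "cites", "report-B")
g.add("report-B", "cites", "dataset-C")
g.add("story-Z", "published_in", "outlet-Y")

print(g.features("story-A"))
print(g.features("story-Z"))
```

In this toy data, the first story has a named author and a chain of sources, while the second has neither. Whether differences like these actually separate fake stories from legitimate ones is the hypothesis the quote describes, not something this sketch establishes.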
Seltzer is also interested in low-level systems questions and is designing a new operating system. She loves interdisciplinary work and the thrill of cross-pollination.
“I have the attention span of a mosquito,” she confesses with a chuckle. “I’ve never met a research question I didn’t like.”