Cybersecurity at Proof

Allison Bishop
Published in Proof Reading
Oct 28, 2021

Section 1. Introduction

There are some words in the English language that have been forced to shoulder ridiculous burdens. One of these words is “secure.” Much has been written about ways to keep data “secure” and ways to make networks “secure”. It works as an adjective: “get your secure cloud products here!” It works as a verb: “secure your company’s data today!” And when it wants to work as a noun, it armors itself in the many letters of “cybersecurity.”

It’s not a mystery why we ask so much of the word “secure.” We want security to be a thing we can cleanly separate from the mess of our everyday decisions — delegated to experts and managed independently from our other goals. We want to have our cake, and eat it “securely” too.

This is a premise that cybersecurity tools explicitly embrace and cater to. Want to offload your data storage and still keep it “secure”? Use encryption! Want to have an internet that is open to all but still know who you’re communicating with? Use cryptographically signed certificates!

Indeed, the alphabet soup of cybersecurity tools (SSH, VPNs, AES, VPCs, SHA, …) can accomplish some pretty impressive feats. But if we lose our wonder for what precisely they accomplish and how fragile it is, we may begin to treat them as catch-all techniques that do all the hard work for us. This can be dangerous because security is never fully separable from our other actions and goals. What we want from a security perspective might be subtly (or not-so subtly) in conflict with what we want from a usability perspective. Fundamentally, security and usability exist jointly in a space of possible tradeoffs, and the choices we make for one have important consequences for the other. Language that is positioned as “do whatever you want, but securely” blinds us to these connections and dulls our understanding of important nuances. Even within the realm of security considerations, there are important tradeoffs. A design choice made to defend against one possible threat may introduce another: for example, mechanisms designed to retrieve and deploy software patches may themselves become a vulnerable entry point into a critical system.

As a result, we insist upon discussing cybersecurity measures with a high degree of specificity. What scenarios are we protecting against, and how? What kinds of resources and circumstances would it take to overcome our protections? Which kinds of threats are more likely than others? And ultimately, why do we feel the design decisions we’ve made represent a prudent and robust state in terms of cybersecurity?

Naturally, dealing in more specifics raises the question: but what are the threats you haven’t thought of? Obviously you can’t specifically reason about those. This is another reason why language that abstracts above specifics is attractive — it may give the impression that you’ve thought of everything. But thinking systematically to anticipate as many threats as possible is distinct from speaking in generalities. Good abstractions — the kind that help you mitigate and reason about categories of problems without having to think individually about every possible example — are still specific about the boundaries of the categories.

What you can expect from this document is a systematic overview of our approach to protecting our trading system and our clients’ data from cybersecurity threats. We will go into sufficient detail for technical readers to get a good grasp of our cybersecurity design. We will also provide sufficient background for non-technical readers to acquire a meaningful sense of what our security controls mean, what kind of threats they are designed to prevent, and how we believe they compare to common alternatives.

Organization of this document:

The next section of this document (Background) provides relevant context about underlying tools like hashing, encryption, and threat modeling that are certainly not unique to Proof, and may already be familiar to technical readers. The following three sections (Network Security at Proof, Code and Data Security at Proof, and Operational Security at Proof) describe the measures we have put in place to protect our digital assets and the reasoning behind them. In the final Cybersecurity FAQ section, we address some common questions.

Section 2. Background

There are several common tools that we make use of to help protect our trading system and other critical assets. Fundamentally, a trading system requires several different components which must share some kinds of data and communicate with each other, as well as with outside parties (e.g. upstream clients sending orders to trade and downstream trading venues like exchanges). In addition, data must be backed up and stored for our own use (e.g. analyzing the behavior of our trading algorithms, examining the logs of our system’s behavior, etc.) and for regulatory purposes. Hence tools that protect information while in transit and at rest are highly relevant to our use case, as well as tools that control access to software and system capabilities in a network setting. Here we describe the important features of these tools, including their known limitations.

Collision-resistant hash functions: Collision-resistant hash functions are a tool for detecting changes in data and software. The hash of a file (or any other object that can be expressed as a string of bits) is a short string that can be quickly computed from the file and is expected to change if anything in the file changes. Checking a hash is a highly efficient way of confirming that two files are the same, or that one file is the same as its previous version. But since the hash is considerably smaller (and hence easier to store and check) than the entire file, it is theoretically possible that the hash will be the same even when the file is different. This is not a design flaw in any particular hash function — it is an unavoidable consequence of the fact that there are more possible large files than there are possible small strings. Nonetheless, when hash functions are well-designed to be collision-resistant, it is infeasible for even a malicious party to find two differing files that have the same hash value. We know they exist, but actually computing them is believed to be so hard that it cannot be done with currently available computing resources. Hash function designs are standardized by NIST, and standardized designs like SHA256 (currently in use) tend to survive multiple decades of scrutiny before requiring upgrades. Historically, some hash function designs like MD5 have ultimately been broken (meaning that people found ways to compute different files with the same hashes). These breaks tend to follow a slow pattern of evolution over a period of months to years as researchers discover initial weaknesses that are gradually developed into exploits. Such a process tends to provide ample warning and time for implementations of hash functions to be upgraded. It is important to note that hashes are a powerful tool for data integrity, but do not on their own do anything to hide the data (i.e. they do not address data confidentiality).
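
As a minimal illustration (not code from our system), here is how one might compute and compare file hashes using Python’s standard hashlib; the file path is just a placeholder:

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hash of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Comparing short hash strings is enough to detect any change to the file.
# "config.json" is a hypothetical path used only for illustration.
baseline = sha256_of_file("config.json")
# ... later ...
if sha256_of_file("config.json") != baseline:
    print("File has changed since the baseline hash was recorded.")
```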

Private Key Encryption: Encryption is a tool for protecting the contents of data from eavesdroppers, particularly while data is in transit or at rest. In private key encryption, there is a single secret key that can be used to encrypt or decrypt data. To someone who does not know the secret key, encrypted data will not reveal the underlying contents. However, meta-data such as the size of the contents may be revealed, as the size of the encrypted data is typically proportional to the size of the original (unencrypted) contents. Encryption also does not hide who is sending data to whom, and by what means. In particular, the fact that encryption is being used is noticeable. The most commonly used (and highly recommended) algorithm for private key encryption is AES (the “Advanced Encryption Standard”).
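
To make the mechanics concrete, here is a small sketch of private key (symmetric) encryption using AES in GCM mode via the widely used Python cryptography package. This is purely illustrative and says nothing about how any particular system manages its keys:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # the single shared secret key
aesgcm = AESGCM(key)

nonce = os.urandom(12)  # must be unique per message under a given key
plaintext = b"example message contents"
ciphertext = aesgcm.encrypt(nonce, plaintext, None)  # None = no associated data

# Anyone holding the same key (plus the nonce, which need not be secret) can decrypt:
recovered = aesgcm.decrypt(nonce, ciphertext, None)
assert recovered == plaintext

# Note: the ciphertext length tracks the plaintext length, so message size is not hidden.
```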

Public Key Encryption: Public key encryption protects the contents of data similarly to private key encryption, but instead of one secret key, there are two keys: one secret and one public. The public key is used to encrypt, while the secret key is used to decrypt. This differentiation of keys allows us to separate the capability to encrypt from the capability to decrypt. This is particularly useful for situations where one entity needs to receive encrypted contents from many different sources. Establishing a shared secret key for encryption and decryption with each source would be unwieldy and inefficient. A public key encryption algorithm instead allows the recipient to publish a single public key that all senders can use to encrypt data/messages to that recipient. Crucially, the decryption key remains secret and is never shared with anyone. This way, senders cannot read any of the messages that are encrypted to the recipient — they are only enabled to encrypt, not decrypt. This handy functionality does come at a computational cost — public key encryption algorithms use larger keys and are slower to compute compared to private key ones. As a result, hybrid encryption is often used — meaning that public key encryption is used to communicate a fresh, shared secret key that can then be used for private key encryption throughout the remaining communication session.

The most commonly used public key encryption algorithms are RSA and algorithms based on elliptic curves. Like collision-resistant hash functions, algorithms for both private and public key encryption are exposed to decades of public vetting and scrutiny, and are standardized by NIST. Any weaknesses found in the standards tend to develop into exploits over a slow timeline of years, providing ample time for standards to be upgraded and re-implemented.
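
The hybrid pattern described above can be sketched as follows, again purely as an illustration: RSA with OAEP padding wraps a freshly generated AES key, and the bulk data is then encrypted symmetrically. The key sizes and padding choices here are common defaults, not a statement of what any particular system deploys:

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Recipient generates a key pair once; only the public key is ever published.
recipient_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
recipient_public = recipient_private.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Sender: wrap a fresh symmetric key with the recipient's public key,
# then encrypt the bulk data with the (much faster) symmetric algorithm.
session_key = AESGCM.generate_key(bit_length=256)
wrapped_key = recipient_public.encrypt(session_key, oaep)
nonce = os.urandom(12)
ciphertext = AESGCM(session_key).encrypt(nonce, b"bulk message data", None)

# Recipient: unwrap the session key with the private key, then decrypt the bulk data.
unwrapped_key = recipient_private.decrypt(wrapped_key, oaep)
assert AESGCM(unwrapped_key).decrypt(nonce, ciphertext, None) == b"bulk message data"
```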

Public Key Infrastructure, Digital Signatures, and Certificates: While hashing and encryption are powerful tools for protecting the content of communications from threats to confidentiality and integrity, they do not address questions of authentication. Encrypting data only protects your data if you encrypt it to the right recipient. If a malicious party can somehow convince you to encrypt it to keys that they know, then the data will still fall into the wrong hands. For this reason, public key infrastructure (mechanisms for registering and verifying public keys) is a core and necessary tool.

The mathematical heart of public key infrastructure is something called a digital signature scheme. A digital signature scheme allows an entity to generate a pair of keys, one public and one private, and use their private key to sign messages in such a way that anyone can verify the authenticity of the signature using the public key. This combined with a hard-coded root of trust can be used to generate reliable public key infrastructure. This works in the following way: public keys for a small number of trusted entities can be hard-coded in a machine. When a communication takes place with an entity that is not one of these trusted roots, the entity can be asked to provide a signed certificate that links its claimed identity to its claimed public key. If such a certificate is valid and authentically signed by one of the trusted roots, then the claimed public key is accepted and used for encrypting the communication to the intended recipient. Continuing chains of signatures can optionally be supported: e.g. a certificate signed by a non-root entity might be accepted, if that non-root entity has a certificate signed by a root entity, and so on.

The mathematics that underlies commonly used digital signature schemes is very similar to the mathematical underpinnings of public key encryption algorithms, and there are several standardized options available.
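
For intuition, here is a minimal signing and verification sketch using Ed25519 from the Python cryptography package; the “certificate contents” string is a made-up placeholder rather than a real certificate format:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()  # kept secret by the signer
verify_key = signing_key.public_key()       # published; anyone can verify with it

message = b"subject=example, public_key=..., expires=..."  # placeholder "certificate" contents
signature = signing_key.sign(message)

# Verification succeeds only for the exact message the signer signed.
verify_key.verify(signature, message)  # no exception: signature is valid

try:
    verify_key.verify(signature, message + b"!")  # any tampering is detected
except InvalidSignature:
    print("tampered message rejected")
```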

IP addresses: When machines communicate with each other in a network, they use IP addresses to address traffic. The unit of traffic is a “packet,” and packets typically need to be routed through several hops before arriving at their final destination. Routing tables are used to direct how packets will move from place to place in between their origin and final destination. IP addresses are part of the Internet Protocol (a standard for internet networking), and in IPv4, they are split into public IP addresses and private IP addresses. (There is a newer version of the protocol, IPv6, but this has not been widely adopted.) Public IP addresses can be used by anyone to address traffic over the open internet. Private IP addresses, however, can be used for communication within a small private network and will not provide a way for arbitrary other machines to communicate into the network.

It should be noted that the protocol IPv4 itself contains no mechanism for enforcing the correctness of IP addresses. Somewhat like physical addresses in the US postal system, anyone can put any return address they like on a packet. However, any mail sent back to that return address will be routed to the spoofed address itself, not back to the duplicitous sender. In addition, there can be other safeguards at other layers of the networking stack. For instance, it may be known that traffic coming in from a particular source should have a particular originating IP address or set of addresses, and this can be checked. In the case of private IP addresses, machines outside of the address space simply aren’t part of the same postal system, so the ability to “spoof” a conforming return address does not translate into the ability to communicate into the private system.
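
Python’s standard ipaddress module makes the public/private distinction easy to check; the addresses below are arbitrary examples:

```python
import ipaddress

for addr in ["10.0.1.17", "192.168.0.5", "8.8.8.8"]:
    ip = ipaddress.ip_address(addr)
    kind = "private" if ip.is_private else "public"
    print(f"{addr} is a {kind} IPv4 address")

# 10.0.0.0/8 and 192.168.0.0/16 are among the reserved private ranges, so the first
# two print "private"; 8.8.8.8 is a public address routable on the open internet.
```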

NAT gateway: A Network Address Translation gateway, or NAT gateway, is used to allow a private network using private IP addresses to communicate with machines on the public internet when it chooses, without allowing public machines to initiate communications into the private network. The NAT gateway uses a public IP address for itself, and forwards communication from the private network to a recipient address on the public internet. It then receives responses addressed to its own public IP address and forwards them back into the private network. The NAT gateway will not accept attempts to communicate that are initiated from the public internet.

SSH, TLS, and IPsec: The Secure Shell Protocol (SSH), Transport Layer Security (TLS), and Internet Protocol Security (IPsec) are protocols that allow a machine to communicate with another machine over an insecure network, with strong protections in place for authenticating the identities of the communicants, protecting the confidentiality of the communication from others (e.g. eavesdroppers), and protecting the integrity of the communication (e.g. detecting any corruption or manipulation in the packets being transmitted). This is accomplished by combining many of the tools we have detailed above: digital signatures for authentication, encryption for data confidentiality, and hashes for data integrity.

These various protocols have many similarities, but operate at different layers of the networking stack. SSH operates at the highest application layer and is typically used to log in to a machine remotely and run commands. TLS operates at the lower transport layer, and IPsec operates at the even lower networking layer.
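
As a small illustration of the transport-layer case, the following sketch opens a TLS connection with Python’s standard ssl module, which verifies the server’s certificate chain against trusted roots before any application data flows. The hostname is a placeholder, not one of our actual counterparties:

```python
import socket
import ssl

hostname = "example.com"  # placeholder peer for illustration only
context = ssl.create_default_context()  # verifies the certificate chain and hostname

with socket.create_connection((hostname, 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
        print("negotiated:", tls_sock.version(), tls_sock.cipher())
        # The peer certificate was validated against the system's trusted roots;
        # a failed validation would have raised ssl.SSLCertVerificationError instead.
        print("peer subject:", tls_sock.getpeercert()["subject"])
```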

Virtual Private Network (VPN): A virtual private network is a combination of protocols and configurations that allows a remote machine to connect into a private network, or allows multiple physical networks to connect to each other. This will typically use tools like IPsec or TLS to provide authentication and security protections.

Virtual Private Cloud (VPC): Cloud providers like Amazon Web Services (AWS) manage vast arrays of physical machines and networking infrastructure that customers can use to deploy software and scale their deployments dynamically. These deployments can be configured to have public IP addresses and be reachable over the open internet, but this is not the only option. Virtual private clouds are deployments of resources configured to behave as private networks using private IP addresses. Access to resources in a VPC can be tightly controlled, and NAT gateways can be used to allow a VPC to initiate communications with the public internet without requiring the VPC to be exposed to incoming connections.
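
For readers unfamiliar with how such resources are created programmatically, here is a heavily simplified sketch using boto3 (the AWS SDK for Python). The CIDR blocks are illustrative, a real deployment also needs an internet gateway and route tables (omitted here), and none of this is a recipe for our actual configuration:

```python
import boto3

ec2 = boto3.client("ec2")  # assumes AWS credentials are already configured

# Create a VPC with a private IP range, plus one private and one public subnet.
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]
private_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24")["Subnet"]["SubnetId"]
public_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.2.0/24")["Subnet"]["SubnetId"]

# A NAT gateway sits in the public subnet and lets instances in the private subnet
# initiate outbound connections without being reachable from the open internet.
allocation_id = ec2.allocate_address(Domain="vpc")["AllocationId"]
nat_gw_id = ec2.create_nat_gateway(SubnetId=public_subnet,
                                   AllocationId=allocation_id)["NatGateway"]["NatGatewayId"]
print("created", vpc_id, private_subnet, nat_gw_id)
```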

Threat-modeling frameworks (NIST): We also take inspiration from well-known threat-modeling frameworks, though these are philosophical and organizational tools rather than pieces of technology. In particular, we employ the philosophy of the NIST framework that decomposes cybersecurity preparedness and response into five categories: identify, protect, detect, respond, and recover. In the design, operation, and evolution of our security controls, we seek to: identify threats systematically, protect against threats with appropriate tools, detect any violations of our policies or signs of unusual activity, respond swiftly and effectively to any detected problems, and recover smoothly after taking responsive actions. Finally, we endeavor to stay informed and up-to-date on cybersecurity best practices, emerging threats, and emerging tools.

Section 3. Network Security at Proof

The current nature of US equities trading necessitates a complex network of connections between participants. As an agency broker-dealer, Proof must build and maintain a trading system that is configured to communicate with many parties: OMS and EMS providers who forward client orders, technology vendors who provide services like real-time market data feeds, and providers of market access. In addition, Proof employees must be able to interact with the trading system as functionally required to do their jobs. The network architecture that hosts Proof’s trading system and the various means of egress and ingress represent critical assets that must be tightly controlled and protected from unintended uses, both malicious and accidental.

Threat Model and Design Principles: From a cybersecurity perspective, the gravest threats to our trading system are unauthorized/unintended access, as well as interception or corruption of our communications in transit. This is not to dismiss other concerns. Naturally, authorized accesses may have worrisome consequences if there are bugs in our software, fat finger orders, or misunderstandings between our software and our vendors, etc. But these situations are intended to be addressed by our risk management controls, which are detailed in a separate document. So here we will focus on the threats of unauthorized access as well as threats to data confidentiality and integrity during live system operations.

In designing our network architecture, we adhere to the principle of “least privilege”: each user or component of our system should be endowed with the weakest capabilities that are sufficient to perform their assigned function. This avoids the introduction of unnecessary attack surfaces. We also separate the components of our system into separate networks to create additional opportunities for control mechanisms that stand between an actor and a given resource. Overall, we incorporate conservative, redundant, and multi-factor controls, thereby mitigating the potential impact of a single control failure as well as limiting the potential for damage from activities that fall within authorized access patterns.

To identify and protect against threats to our network architecture, we take an inventory of the possible points of access to our system. Every access point represents a potential threat surface for unauthorized access, and for each we consider what measures are in place to protect and monitor it. Every time different components of our system communicate, this represents a potential threat surface for data to be observed, intercepted, or corrupted. For each communication, we draw upon the tools detailed above to provide strong authentication, confidentiality, and integrity controls.

Our Network Architecture and Internal Access Controls: Physically, our trading system runs on machines maintained by AWS. The physical security of these machines is provided by AWS, which employs a broad and deep set of controls and monitoring mechanisms for their data centers (see https://aws.amazon.com/compliance/data-center/controls/). Frankly, this high level of physical security for data centers is table stakes for cloud providers as well as financial firms by this point, and is unlikely to be a significant differentiator between high-grade systems.

Logically, our trading system consists of three separate VPCs. There is a management VPC which represents the single point of entry for us to interact with our system in real time from the outside. There is a trading VPC which runs the trading logic and processes market data. There is also a web VPC that runs the GUI that we use for internal support.

Each VPC uses a distinct space of private IP addresses. Communication across the VPCs is limited to what is necessary for the various components to function. In particular, the trading VPC and the web VPC do not communicate with each other directly. Instead, an open-source piece of software called Redis runs in the management VPC and can communicate with both the web VPC and the trading VPC. Up-to-date information about trading activity is conveyed to the web VPC through Redis. The allowable communications between servers in the virtual private cloud environments are prescribed via security groups. Security groups are a mechanism provided by AWS for expressing which sources/destinations of traffic are allowed to be inbound/outbound at each server. A single security group corresponds to a set of “allow” rules: all traffic is disallowed by default, so traffic that is allowed represents the exception and must be specified explicitly. A server can be a part of multiple security groups, in which case its allowed traffic is the union of the traffic allowed by each of the security groups it belongs to. Communication between servers across Proof’s VPCs occurs through IPsec to protect data confidentiality and integrity in transit.
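
As an illustration of the “allow rules only” model, the boto3 sketch below adds a single ingress exception permitting the standard Redis port from two specific security groups. All identifiers are hypothetical placeholders, not our actual configuration:

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical IDs: a security group for the Redis host, and the security
# groups of the servers that should be allowed to reach it.
redis_sg = "sg-0redisplaceholder"
trading_sg = "sg-0tradingplaceholder"
web_sg = "sg-0webplaceholder"

# All traffic is denied by default; this adds the only exceptions:
# inbound TCP 6379 (the standard Redis port), and only from the two named groups.
ec2.authorize_security_group_ingress(
    GroupId=redis_sg,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 6379,
        "ToPort": 6379,
        "UserIdGroupPairs": [
            {"GroupId": trading_sg},
            {"GroupId": web_sg},
        ],
    }],
)
```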

Another mechanism we use to tighten access and communication between our resources in AWS is IAM, which stands for “Identity and Access Management.” This is a mechanism that AWS provides for assigning roles and permissions to machines and applications as well as to users. Having a specified role can allow a virtual machine or application to access certain resources (e.g. the parameter store in AWS). These roles are again designed using the principle of least privilege and assigned to resources in our system at deployment time. In our system, we make the effort to tighten permissions down so that resources have specific access (down to the API level) to only what they need. For example, if an EC2 instance needs to pull s3://foo/bar/baz.json, then we allow s3:getObject on s3://foo/bar/baz.json, and not s3:* on s3://foo/* . [Note: the *’s here represent wildcards that would allow access to a potentially wider set of resources.]
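
A least-privilege policy of the kind described above might look like the following boto3 sketch, with the bucket, object, role, and policy names as illustrative placeholders:

```python
import json
import boto3

iam = boto3.client("iam")

# A least-privilege policy: read access to one specific object and nothing else.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::foo/bar/baz.json",
    }],
}

# Attach it as an inline policy on the role assumed by the EC2 instance.
# Role and policy names below are hypothetical.
iam.put_role_policy(
    RoleName="example-instance-role",
    PolicyName="read-single-config-object",
    PolicyDocument=json.dumps(policy),
)
```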

Proof Employee Access:

Proof employees connect to the management VPC through a VPN endpoint. The root of trust for the authentication process is a public key corresponding to a private key that was generated and is secured by Proof’s Chief Technology Officer (CTO). Each Proof employee generates their own public key and private key pair and obtains a certificate for their public key signed by the CTO. These certificates are used to authenticate employees and allow them to connect to the VPN. Private keys are kept locally on employees’ machines and are never transmitted or shared.

The ability to log into a particular machine inside Proof’s VPCs requires an additional step. In addition to the key pairs used for VPN certificates, Proof employees generate separate public/private key pairs that enable SSH logins. An employee cannot log into machines like those in the trading VPC unless his or her public key has been explicitly added to a list of authorized public keys for that machine. With agent forwarding in SSH, such logins are accomplished without the private keys ever leaving the employee’s local machine. The list of authorized public keys for the trading VPC currently only contains two keys: one for the CTO and one for the Chief Software Architect. This list of authorized public keys is refreshed every 12 hours through a parameter store that is only available to administrators for Proof’s AWS account.
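
For concreteness, generating such a key pair locally might look like the sketch below, using the Python cryptography package. The resulting public key line is what gets added to a machine’s list of authorized keys, while the private key stays on the employee’s device (and in practice should be protected with a passphrase):

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Generate a key pair locally; the private key never leaves this machine.
key = Ed25519PrivateKey.generate()

# Serialize the private key in OpenSSH format (unencrypted here for brevity;
# a real key should be protected with a passphrase).
private_openssh = key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.OpenSSH,
    encryption_algorithm=serialization.NoEncryption(),
)

# This single line is what would be appended to ~/.ssh/authorized_keys on a
# server that should accept logins from this key.
public_openssh = key.public_key().public_bytes(
    encoding=serialization.Encoding.OpenSSH,
    format=serialization.PublicFormat.OpenSSH,
)
print(public_openssh.decode())  # e.g. "ssh-ed25519 AAAA..."
```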

Accessing Proof’s internal support GUI also requires passing through two controls. First, an employee must be connected through the VPN. Second, he or she must log in to the GUI. These logins are managed through Auth0. Employees whose job functions do not require active participation in trade monitoring activities have accounts that are restricted to read-only access.

While running, Proof’s trading system generates extensive logs and records. Records are stored first in a real-time database and then transferred nightly into a historical database. For database management, we use SingleStore (formerly MemSQL). The database runs in real time in the trading VPC, though like all resources in the trading VPC, it can only be accessed indirectly by going through the management VPC. Reading the database requires an additional login. Employees who need to access the database are given individual accounts equipped with minimal permissions and protected by passwords.

Connections to Third Parties:

Our system connects to the internet through a NAT gateway. This is used to initiate connections with upstream and downstream parties that we need to interact with. This includes OMS/EMS providers, direct market access, and our market data provider.

We also connect to TNS using AWS Direct Connect, which uses IPsec and ultimately a physical cross-connect to reach the data centers where our counterparties reside. For this purpose, our system uses a distinct private IP space that is whitelisted by TNS (this IP space cannot communicate with our other private IP subnets, and it is strictly for TNS connectivity). TNS also provides IP whitelisting for any inbound connections (though there are currently none; we only connect outbound).

Detection, Response, and Recovery:

Logs are kept of employee accesses to AWS resources, and attempts at unauthorized access raise automatic alerts.

In response to any detected problems, we have the option to shut down the trading system. There are multiple means to do that, as detailed in our risk management documentation. The detailed logs of our system can be used to diagnose what happened and identify the specific account and/or action that caused a problem and the extent of the exposure. A threat assessment will then identify any potential for further compromised accounts/credentials etc. For example, if an employee account for some service has been compromised, it is reasonable to suspect that other accounts maintained by that employee may also be compromised. Once potential compromises are identified, accesses can be revoked. For example, we actively maintain and refresh a Certificate Revocation List (CRL) associated with the VPN endpoint so that VPN certificates can be immediately revoked. Also public keys can be removed from the whitelists for logging into particular machines, employee accounts and passwords can be reset, etc.

To recover, access can be re-established as appropriate by generating fresh keys or certificates and setting fresh passwords.

Section 4. Code and Data Security at Proof

The trading system itself is not the only critical asset Proof must protect. Our software and our persistent data storage are also subject to the threat of unauthorized access, as well as the threats of accidental corruption and loss.

Our own software is maintained in git repositories on BitBucket (https://bitbucket.org/product/). Access to BitBucket (and related Atlassian tools like Jira that are used in our software development processes) is controlled through password-protected accounts. Git repositories provide protection against corruption or loss by automatically storing prior versions of the software.

Our CTO receives email alerts for commits to important code repositories (“commits” are git-speak for “changes to the software”). Our software development life cycle process is also designed to minimize the risk of an inadvertent or malicious change to our software. In particular, every release of our software is accompanied by a notification containing an exhaustive list of commits made since the last release.

Open source software that we rely upon, like Redis, must be periodically updated (this includes applying any security-related patches). For the web VPC, security patches are applied any time the software is rebuilt (which we do periodically). For the trading VPC, jump hosts, and database servers, security patches are applied automatically on a weekly basis (every Friday).

We store logs from our trading system and other persistent data in S3 inside AWS. All data in AWS is encrypted at rest, and access is controlled through individuals’ AWS accounts. Data stored in S3 is automatically stored redundantly by Amazon to prevent data loss, along with automatic detection and recovery for any lost redundancy.
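
As a generic illustration of writing and reading such objects with server-side encryption requested explicitly (the bucket and key names are placeholders, and this is not our actual code):

```python
import boto3

s3 = boto3.client("s3")

# Write an object with server-side encryption via AWS KMS; access to the object
# is still governed by IAM policies on the caller. Names below are hypothetical.
s3.put_object(
    Bucket="example-trading-logs",
    Key="2021/10/28/system.log",
    Body=b"example log contents",
    ServerSideEncryption="aws:kms",
)

# Reading it back requires credentials that IAM authorizes for s3:GetObject on this key.
obj = s3.get_object(Bucket="example-trading-logs", Key="2021/10/28/system.log")
data = obj["Body"].read()
```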

Additionally, all data that is required to be maintained as part of our corporate books and records is periodically backed up to Wasabi (https://wasabi.com/).

Section 5. Operational Security at Proof

Ultimately, good cybersecurity practices can reduce a threat surface or move a threat surface from one component to another, but they can never wholly eliminate a threat. We must still rely on software that will have flaws and humans who will leave their laptops in cabs and their passwords on post-it notes. If all of the protection mechanisms we have detailed above work as intended, it is unlikely that a malicious actor will access our trading system or obtain any confidential data by directly accessing our cloud environment or our persistent data storage mechanisms. However, employee account credentials and local devices remain subject to threats like theft, malware, and social engineering.

Relevant Employee Credentials and Accounts: Employees at Proof must maintain and control access to several different kinds of accounts. These include Google accounts for their prooftrading.com email and other Google services, Slack accounts for internal employee communications, AWS accounts, Atlassian accounts (for Bitbucket, Jira, and other software development tools), accounts with vendors, and accounts for regulatory reporting (e.g. for accessing FINRA OATS and CAT reporting). Relevant employee credentials also include the private keys that enable VPN access and logins to various AWS resources, as well as access credentials for Proof’s physical office space.

Preventative Measures:

Employees are required to maintain strong and distinct passwords for all of their work-related accounts. Private keys are to be stored locally on single devices and never transferred between devices. Two-factor authentication should also be enabled whenever practical.

Detection, Response, and Recovery:

Employees themselves should immediately report any missing devices or suspected compromise of passwords or devices. Upon detection or suspicion of a threat, VPN certificates and public keys for login can be immediately revoked. Recovery requires the generation of fresh certificates and public keys on a recovered or new machine. Passwords for accounts can be changed and two-factor settings can be updated to new devices as appropriate. We have an incident response plan available in Appendix A.

Section 6. Cybersecurity FAQ

Is using AWS less secure than having a proprietary infrastructure?

We can approach this through the following lens: what threats might apply to a virtual private cloud in AWS that do not apply to a proprietary architecture? As we previously mentioned, strong controls on the data center itself are table stakes and unlikely to be differentiating. Also, most people who have “on-prem” systems do not actually own their data center! They are renting space in someone else’s data center (e.g. Equinix) anyway. Hence the difference mostly boils down to the possibility that AWS fails to properly isolate our resources on a private network, or we fail to configure our AWS resources appropriately. We note that immediately addressing any threats to proper resource segregation is core to the business of AWS, and AWS is highly resourced, experienced, and incentivized in this regard. Similarly, enabling customers to configure their VPCs in the ways that they intend and understand is also a core competency of AWS.

Whether it runs on a VPC or on proprietary infrastructure, a trading system will ultimately need to connect to external parties through mechanisms like physical connectivity, NAT gateways, and VPNs, and there is always a threat of some misconfiguration or other kind of failure through those surfaces. Security with physical connectivity is typically based on whitelisting well-known public IPs that counterparties communicate from. In our case, TNS performs this function for us. For most systems (including ours, we believe), the biggest threats are social engineering and attack through an administrator’s machine. This is not something that is differentiating between cloud vs. more in-house solutions.

Ultimately, there is not much basis to suspect that we would have a safer network configuration doing things ourselves as compared to building upon tools provided by AWS.

Might publishing this document make your security controls less effective?

Though some proponents of the practice of “security through obscurity” remain, we believe that keeping cybersecurity approaches undisclosed would ultimately yield less security rather than more. This is for a couple of reasons. First, since our defensive practices are drawn from standard, well-vetted, and common tools, they would not be that difficult to anticipate. And if they were not drawn from common and well-vetted tools, we would be on shaky ground. Second, hiding defensive practices removes scrutiny and a potential for feedback and collaboration that may drive security improvements. Developing cybersecurity practices in the absence of transparency removes some incentives for good design and contributes to a culture that is deficient in accountability.

Is a small company more likely to suffer a cybersecurity breach than a large one?

There are pros and cons to having a small team/company from a cybersecurity perspective. Since Proof currently has fewer than 10 employees, we have limited resources. This is why we heavily leverage the tools that AWS provides and maintains with a much larger team and wealth of experience.

Our small size also represents a small attack surface for phishing and other forms of social engineering. Since we all know each other well and communicate continually, it would be unrealistic for an outsider to impersonate a Proof employee, even through a spoofed email address, which can be a significant attack vector for large corporations. Our small size also means that a very limited number of eyes are authorized to access any particular resource or piece of data.

Are quantum computers something I should be worried about?

Meh. They’ve been “just on the horizon” or “less than 10 years away” now for a lot more than ten years. But nonetheless, if scalable quantum computers became available, their immediate impact on cybersecurity would be limited to the particular cryptographic algorithms that they threaten. Commonly used public key encryption and signature algorithms like RSA and those based on the discrete logarithm problem in elliptic curves are known to be vulnerable to quantum attacks at scale. However, there are well-studied and long-vetted alternative designs for public key encryption and signatures based on different underlying mathematics that are not known to be threatened by quantum attacks. These are not used widely today because they are less efficient than RSA and current elliptic curve based standards. But a gradual upgrade of cryptographic standards to switch to designs that are thought to resist quantum attacks is already underway.

What’s your take on blockchains?

It’s best not to ask us that unless you are prepared for a long rant about proper database design principles.

Appendix A: Incident Response Plan

Employees at Proof access sensitive resources through devices, namely laptops and phones. They also access the Proof office through key fobs and access funds through corporate credit cards. Should a device become lost, stolen, or otherwise compromised, it is important that we act quickly to cut off that device from access. Below we detail the steps to accomplish this. This process will be performed by the affected employee in collaboration with the CTO.

Step 1: Identify all vulnerable accounts and resources

We begin by identifying the full set of employee accounts and resources that might be accessible fully or partly through use of the compromised device. This will be a subset of the following list of all Proof accounts and resources (organized by category):

Internal and external communications: Google (Gmail and Google Drive), Slack, Atlassian (Confluence), Notion, Proof’s Twitter account, Proof’s Zoom account, Proof’s Medium account

Trading system and internal GUI access: AWS account, Jenkins, VPN certificates, Datadog account

Regulatory and reporting resources: Apex Online account, FINRA Gateway account, FINRA CAT Account, CAIS reporting account

Database Access: OneTick access, SingleStore (formerly MemSQL) account

Software development: Bitbucket, Atlassian (Jira), Gitlab, Bitwarden

Finance: Gusto account (payroll), Brex (physical credit card as well as online account), First Republic (physical card as well as online account), benefits accounts (health insurance, retirement benefits)

Office: key fobs for building access and room access, SPACES online account

Other: Overleaf account (shared LaTeX platform for research writeups), ShareFile

Step 2: Revoke access for all identified affected accounts and resources and restore legitimate access

For any potentially compromised account, passwords must be changed, and any 2-factor authentication mechanisms that point to the compromised device must also be changed. Any affected VPN certificates can be revoked by being added to our revocation list.

If any account’s access cannot be revoked (e.g. we cannot get into the account to change the password because the password has already been changed), the third parties who manage such accounts must be contacted to revoke access from the compromised accounts and restore access to the legitimate account owner.

For any account that was actively signed in on the lost/stolen/compromised device, it is also important to sign out of all active sessions. To do this on Slack for example, see https://slack.com/help/articles/214613347-Sign-out-of-Slack#mobile-1.

In the case of loss or theft of key fobs for access to the physical office, the office provider SPACES must be contacted to revoke access to those fobs and issue new ones to the affected employee. Each fob is associated specifically to an employee, so this can be easily done without disrupting the access of other employees.

Step 3: Screen for any potential activity on accounts between the potential compromise and the revocation/recovery of access.

Next we attempt to more fully understand the ramifications of the compromise and identify any further steps that need to be taken to remediate them. We do this by looking for and reviewing any potential activity that may have occurred through compromised accounts/devices during the period of potential compromise.

For accounts and resources related to the trading system or databases, system logs can be used to identify and investigate any activity that may have taken place during the period of potential compromise.

For accounts and resources related to our finances, online statements and transaction records can be used to screen for any unauthorized activity.

Step 4: Identify and adopt any indicated process improvements

Retrospectively, we will view each incident as an opportunity to identify potential improvements to our processes. In particular, any difficulties that arise during implementation of the above steps will be noted and possible remediations to make them faster and smoother in the future will be considered. Also, we will attempt to identify and remediate any unnecessary dependencies between devices that made accounts or resources vulnerable in a case where they didn’t need to be. The affected employee will have a brief retrospective meeting with either the CTO or the President to discuss what was learned through this process upon completion, and this may spur updates to our cybersecurity policies. This meeting will be treated as a positive opportunity for the affected employee to leverage their experience to help improve the company’s overall security, not as a vehicle for assigning blame. We believe that treating incidents as positive learning opportunities without an element of individual blame or punishment will encourage timely reporting of incidents, both large and small, and ultimately result in a stronger security posture. Any insights gleaned or policy updates resulting from this process will then be shared with the whole team.
