How to Categorize and Prevent Risks of Sensitive Links in URLScan

Published in Tinder Tech Blog, Nov 7, 2022

Authors: Rojan Rijal, Tinder Security Labs | Johnny Nipper, Product Security Manager | Tanner Emek, Engineering Manager

Recently, Tinder Security Labs gave a talk at Recon Village @ Defcon 30 called “Scanning your way into internal systems via URLScan.” We went over examples of sensitive links indexed by URLScan that could be leveraged to gain access to corporate systems. We also covered mitigations that can help prevent accidental disclosure and indexing of these links.

Sensitive links and URLScan

We wanted to create queries that would specifically identify indexed sensitive links. The services we looked into ranged from file-sharing services and ticketing systems to single sign-on services. A malicious user with these sensitive links could access shared files within the organization, gain full access to the internal ticketing system, or, in the worst-case scenario, gain access to a company’s single sign-on portal and, through it, the applications assigned to an employee.

We looked into URLScan specifically because it is a security tool used across the industry to identify malicious websites, through both its API and its web portal. Because of its API support, various enterprise security vendors use it as an integration. These integrations can be misconfigured, which in this case discloses sensitive links to the public.

https://urlscan.io homepage

The Past

Since URLScan supports API calls to scan websites, various companies and vendors continue to integrate it into their products. These integrations, however, are sometimes misconfigured and result in security issues such as exposing sensitive information to unauthorized parties. One example is when GitHub accidentally disclosed private repository names and links between January and September 2021. To identify phishing and malicious websites, GitHub would trigger a URLScan request every time a GitHub Page was built. The scan request was sent with visibility set to public. While the feature was designed to prevent abuse, it ended up disclosing private repository names and page URLs to the public.

Excerpt from the email sent by GitHub

GitHub is one example that stands out even though the impact of the disclosure was minimal. There are countless more cases with more severe impacts, such as leaked internal documents, internal portals and sensitive magic links, giving attackers authenticated access to internal services.

The Present

Similar to the GitHub case, we identified various indexed scans in URLScan that contained sensitive information. While these links are for third-party services that companies use, it is important to highlight that this is the fault of neither URLScan nor the third-party vendors who provided services to the affected companies.

Magic links for file-sharing services

File-sharing services such as Google Drive (Google Docs, Sheets, Slides, etc) and Office 365 allow users to share files through different permission models. The open permission model allows anyone with the link to access the document with read-only permission (unless changed to edit).

Publicly accessible magic-link documents are common when organizations share documents with external vendors. If such links are disclosed to malicious users, they can access potentially sensitive information.

One of the common ways for these links to be disclosed is through URLScan, when employees and email security products submit them for scanning as potentially malicious documents. We used different queries to identify various documents and understand the impact of such disclosure. To make sure our results were relevant, we focused on common patterns that distinguish valid shared documents from malicious documents, binaries, or files.

After identifying common patterns in how folders are shared and what the resulting link looks like, we discovered some disclosed sensitive documents shared with a magic link. Based on the URLScan results and the screenshots of those links, the impact ranged from access to contract and HR documents to business plans for various organizations. We manually reported any links and documents that contained overtly sensitive information, for example employee PII, business planning, contract paperwork, and bank routing details.

Screenshot from https://urlscan.io of a sensitive folder, redacted.

Software-As-A-Service (SaaS) Access

Enterprise SaaS products often provide user registration, merger, and login through managed domains. Through these, an organization can make sure that all the employees are part of a single tenant making it easier to enforce security policies such as SSO, user invitations, etc. These features are often based on a domain allowlist controlled by the tenant.

How does domain allowlist work?

Domain allowlist follows the logic that an end user who has verified their email address should be able to automatically register for tenants/organizations that have allowlisted the domain into the account. The allowlist could be created either via a domain verification or by providing a list of allowed domains. This prevents the hassle of manually adding new users to an organization. A classic example of a domain allowlist is Slack. Slack allows organization administrators to identify a list of domains that can automatically join the instance upon email verification.
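The allowlist logic described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any vendor's actual implementation: the `Tenant` type, the `joinable_tenants` function, and the example domains are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Hypothetical tenant record; real SaaS products model this differently."""
    name: str
    allowed_domains: set[str] = field(default_factory=set)


def joinable_tenants(email: str, email_verified: bool, tenants: list[Tenant]) -> list[str]:
    """Return the names of tenants a user could auto-join under a domain allowlist."""
    # Nothing is joinable until the email address has been verified.
    if not email_verified or "@" not in email:
        return []
    domain = email.rsplit("@", 1)[1].lower()
    # A tenant is joinable when the user's email domain is on its allowlist.
    return [
        t.name
        for t in tenants
        if domain in {d.lower() for d in t.allowed_domains}
    ]
```

The only gate in this model is email verification. Whoever completes verification for an address on an allowlisted domain is treated as that user, which is why a leaked verification or invitation link matters so much.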

In addition to domain allowlist, when a user is invited to an organization the invitation link helps the user set up their account to access cross-tenant resources available to other users. We utilized this to identify any sensitive links that would allow malicious users to complete the registration process and access these systems directly. Compared to a simple magic link for ONE document, having a registration link gives users access to a plethora of sensitive information.

File sharing services

Enterprise file storage and sharing services allow administrators to invite their employees to a single-tenant system, giving them access to all shared documents within the organization. In addition, for some services a registered user can also join other tenants that have allowlisted the user’s email domain. After selecting three file-sharing services to review, we tested each one to identify common patterns in their user invitation links. Using queries built on those patterns, we found invitation links for various organizations in URLScan. These invitation links would let a malicious user register, set up 2FA, and access documents shared within the organization.

Enterprise ticketing systems

Similar to file-sharing services, enterprise ticketing systems also use domain capture and multi-tenant setups. A registered user with a specific domain can access different tenants for the same organization and view tickets. These tickets could range from support tickets to scrum planning and security issues. Identifying patterns in their invitation, signup, and password reset URLs gives malicious users a way to discover those links in public indexes such as URLScan’s.

One of the ticketing systems we looked into had a custom domain used for email validation. With this information, we were then able to enumerate and identify invitation and email verification links sent to users of various organizations. Completing this registration will then allow the malicious actor to join the specific ticketing system and many others that the verified email domain could access.

List of joinable instances for a verified email instance on our test account

Single Sign-On (SSO) Portals

While file-sharing services and ticketing systems give access to some internal data, access to a Single Sign-On portal on behalf of an employee can give access to multiple internal applications. Many organizations send an invitation email to new hires on their first day. This invitation email allows new hires to set up their SSO account and access internal resources such as Workday, internal tools, JIRA, Confluence, and more. When these invitation links get indexed by services such as URLScan, it gives malicious actors a short window to register on behalf of the user and gain access to such information.

During our review of such invitation links, we noticed that organizations were actively sending these links to URLScan. This was most likely due to a security vendor forwarding links to URLScan with visibility set to public. When first searched, the query for a specific SSO service returned 1300 invitation links. In addition, we noticed that about 20–50 new and active links were being indexed every day. The invitation link then allowed users to go through the full registration workflow, from setting up passwords, security answers, and secret images to the 2FA settings. Once completed, it would give access to the SSO portal on behalf of the invited user, with that user’s permissions.

Snapshot of an SSO registration portal from URLScan

Conclusion

While preparing for the talk, we identified numerous sensitive links for various organizations that would give access to critical third-party services. As part of our mitigation efforts, we worked with URLScan to share the sensitive links and indexes for the deletion process. The URLScan team was proactive in receiving such information and safely removed all the data from the public search. For example, it is no longer possible to find a disclosed link for SSO invitations in URLScan.

Further mitigations

If you notice that your organization is directly affected by this, two approaches can help mitigate the disclosures. First, if a link is already indexed by URLScan, it can be reported as sensitive, which starts an internal investigation at URLScan that results in the removal of the scan. This helps mitigate the risk of unauthorized access to documents and SaaS services.
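To find links worth reporting, an organization can review what is already indexed for its own domain. The sketch below builds a URLScan search API URL for a domain and filters an already-fetched result list for URLs that look like invitation or reset links. The keyword list is a hypothetical starting point rather than a set of patterns from the talk, and reading the submitted URL from `task.url` is an assumption about the search response format that should be checked against the current API documentation.

```python
from urllib.parse import urlencode, urlsplit

SEARCH_ENDPOINT = "https://urlscan.io/api/v1/search/"

# Hypothetical keywords that often appear in invitation, verification,
# and password reset links. Tune these for the services you actually use.
SENSITIVE_HINTS = ("invite", "invitation", "reset", "verify", "activate", "token")


def build_search_url(domain: str, size: int = 100) -> str:
    """Build a search URL for scans associated with your own domain."""
    query = urlencode({"q": f"domain:{domain}", "size": size})
    return f"{SEARCH_ENDPOINT}?{query}"


def flag_sensitive(results: list[dict]) -> list[str]:
    """Return submitted URLs whose path or query string looks sensitive."""
    flagged = []
    for item in results:
        # Assumption: each result stores the submitted URL under task.url.
        url = item.get("task", {}).get("url", "")
        parts = urlsplit(url)
        haystack = f"{parts.path}?{parts.query}".lower()
        if any(hint in haystack for hint in SENSITIVE_HINTS):
            flagged.append(url)
    return flagged
```

The filter is intentionally broad. Each flagged URL still needs a manual look before it is reported as sensitive.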

In addition, it is important to investigate the current reporting flow to identify the root cause of disclosing the links to URLScan. For example, one common cause for sensitive links to be disclosed is via API calls through email security products. We would recommend understanding how the email security vendor performs email analysis and changing the URLScan setting to either private or unlisted if possible.
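For teams that call the URLScan API from their own tooling, the visibility setting can be enforced in code. The following is a minimal sketch that assumes the submission endpoint accepts a JSON body with `url` and `visibility` fields and an `API-Key` header. The `build_scan_request` helper is hypothetical, and it only constructs the request without sending it.

```python
import json
import urllib.request

SCAN_ENDPOINT = "https://urlscan.io/api/v1/scan/"

# Only non-public visibility levels are allowed by this helper.
ALLOWED_VISIBILITY = {"private", "unlisted"}


def build_scan_request(url: str, api_key: str, visibility: str = "private") -> urllib.request.Request:
    """Build a scan submission request, refusing public visibility."""
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(
            f"visibility must be one of {sorted(ALLOWED_VISIBILITY)}, got {visibility!r}"
        )
    body = json.dumps({"url": url, "visibility": visibility}).encode("utf-8")
    return urllib.request.Request(
        SCAN_ENDPOINT,
        data=body,
        headers={"API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the scan. Defaulting to `private` and rejecting `public` outright means a misconfigured caller fails loudly instead of quietly publishing a link.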

For further reading, URLScan has written an excellent blog post detailing some security features that will help prevent similar disclosures: https://urlscan.io/blog/2022/07/27/scan-visibility-best-practices/.