Building Secure NL to SQL Solutions: Information Security Considerations

Published in

資安工作者的學習之路

9 min read · Jun 4, 2024

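I recently came across an article titled NL to SQL Architecture Alternatives in the Azure User Taiwan Group. I am not familiar with this use case or the technology behind it, but the scenario seemed like a good opportunity to discuss information security, so I spent a little time writing up a summary along with the security aspects worth considering.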

Natural Language to SQL Architecture (microsoft.com)

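Original article

The moderator of the Azure User Taiwan Group shared the article NL to SQL Architecture Alternatives. The ability of large language models to understand human language is changing how data is queried, and Mark Remmey, a Microsoft Azure AI solution architect, describes three architectures that generate SQL queries directly from natural-language input. He also cautions that dynamically generated SQL brings the risk of SQL injection with it, so anyone converting natural language to SQL must keep that risk in mind.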

NL to SQL Architecture Alternatives

Introduction The recent advances in LLMs have created opportunities for businesses to gain new insight into their…

techcommunity.microsoft.com

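Architecture options:

The article presents several architecture options, each with its own strengths and suitable scenarios. The main ones are:

Prebuilt models from Azure Cognitive Services.
Custom models with Azure Machine Learning.
A serverless architecture based on Azure Functions and Logic Apps.
Data storage options using Azure SQL Database and Azure Cosmos DB.

Prebuilt models from Azure Cognitive Services:

Pros: fast to implement, with little customization or training required.
Cons: less flexible, with limited support for complex queries.

Custom models with Azure Machine Learning:

Pros: highly flexible, able to handle complex queries and specific business requirements.
Cons: high implementation cost, and it requires specialized data science and machine learning expertise.

A serverless architecture based on Azure Functions and Logic Apps:

Pros: low cost, easy to scale, and fast to implement.
Cons: requires familiarity with designing and operating serverless architectures, and may not be efficient enough for complex queries.

Storage options using Azure SQL Database and Azure Cosmos DB:

Pros: strong data storage and query capabilities, with support for large-scale data processing.
Cons: database tuning and management are more complex.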

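I am not a developer myself. The only one of these I have really worked with is Azure SQL Database; the others, such as Azure Functions and Logic Apps, I have mostly just tested in lab environments I built myself. In the same vein, AWS Lambda is a very common serverless service, and security best practices deserve extra attention when using it.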

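Choices for enterprises

Small and medium-sized businesses: consider starting with the prebuilt Azure Cognitive Services models or the serverless architecture based on Azure Functions, to keep implementation cost and complexity down.
Large enterprises and those with special requirements: consider custom models with Azure Machine Learning for a highly flexible, tailored solution. The upfront investment is higher, but so is the long-term return.
The article offers several architecture options for NL to SQL solutions, and together they cover the needs of different kinds of organizations. Traditional SQL has a high barrier to entry, whereas a natural language query (NLQ) system lets more non-technical users take part in data analysis and raises how much of its data a company actually uses. When choosing an architecture, a company needs to weigh cost, implementation difficulty, performance, and future scalability together, so that the chosen solution can support the business over the long term.

Finally, the article notes that whichever architecture is adopted, the following key factors need careful consideration:

Conversational ability: support for contextual understanding of natural language and multi-turn conversations.
Privacy and security: data privacy compliance and security controls.
Scalability: support for many concurrent queries and long-term growth.
Cost and benefit: the trade-off between engineering investment and return.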

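Information security

An NL to SQL solution spans natural language processing (NLP), database queries, and cloud services, and each of these layers can face its own security risks.

The original article lists the following risks and mitigations: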

Risks:

SQL Injection
Handling ambiguity and vagueness
Mitigating bias
Unintentionally writing to database

Mitigations:

Granular permissions and proper authorization — refine access based on roles and groups.
Ensure access to known users.
Implement parameterized queries that separate data from the query itself.
Ensure read-only and only execute permissions.
Implement strict input validation and sanitization procedures with whitelisting, regular expressions, data type checks, escaping.
Implement logging and monitoring.
Explainability and transparency
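
To make the read-only and "unintentionally writing to database" items above more concrete, here is a minimal Python sketch of my own (it is not from the original article). It assumes the LLM returns its SQL as a plain string, uses SQLite purely as a stand-in database, and the names `is_single_select`, `make_readonly_demo_db`, and `run_generated_sql` are hypothetical. The keyword check is deliberately conservative and is only a second layer; the real protection should be a database account that has no write permissions at all.

```python
import sqlite3

# Keywords we refuse to see anywhere in generated SQL (illustrative, not exhaustive).
WRITE_KEYWORDS = {"insert", "update", "delete", "drop", "alter",
                  "create", "replace", "attach", "pragma"}


def is_single_select(sql: str) -> bool:
    """Accept only one statement that starts with SELECT and has no write keywords."""
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped:  # a remaining semicolon suggests stacked statements
        return False
    lowered = stripped.lower()
    if not lowered.startswith("select"):
        return False
    tokens = lowered.replace("(", " ").replace(")", " ").split()
    return not any(token in WRITE_KEYWORDS for token in tokens)


def make_readonly_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database, then switch the connection to query-only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customers (name) VALUES ('alice')")
    conn.commit()
    # SQLite-specific switch: later write attempts on this connection fail.
    conn.execute("PRAGMA query_only = ON")
    return conn


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Run LLM-generated SQL only if it passes the allowlist check."""
    if not is_single_select(sql):
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(sql).fetchall()
```

With this in place, a generated `SELECT name FROM customers` runs normally, while something like `SELECT 1; DROP TABLE customers` is rejected before it ever reaches the database. Even if the check were bypassed, the query-only connection would refuse the write.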

SQL injection - SQL Server

Learn how SQL injection attacks work. Mitigate such attacks by validating input and reviewing code for SQL injection in…

learn.microsoft.com

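Although the common risks are all covered, a few other aspects are worth considering.

Privacy and data protection:

Use encryption to protect data at rest and in transit. Azure provides features such as Transparent Data Encryption (TDE) and various storage encryption options.
Consider techniques such as encryption, anonymization, and data masking so that sensitive data does not leak during transformation and transmission.
Store sensitive data separately from non-sensitive data.
For architectures that use OpenAI models, consider model security, such as preventing malicious manipulation of the model and adversarial attacks.
Harden the security of the database itself.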

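Database security overview - Azure Cosmos DB

Learn how Azure Cosmos DB provides database protection and data security for your data.

learn.microsoft.com

Security overview - Azure SQL Database & Azure SQL Managed Instance & Azure Synapse Analytics

Learn about security in Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics, including how it differs from SQL Server.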

learn.microsoft.com

Enterprise security and governance - Azure Machine Learning

Securely use Azure Machine Learning: authentication, authorization, network security, data encryption, and monitoring.

learn.microsoft.com
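
As a small illustration of the data-masking item in the privacy list above, the following Python sketch masks selected columns in query results before they are handed back to a user or to the model. This is my own example under stated assumptions: the column names in `SENSITIVE_COLUMNS` and the helpers `mask_value` and `mask_row` are hypothetical, and a managed database feature for masking would normally be preferable to doing this in application code.

```python
# Columns treated as sensitive in this sketch (hypothetical names).
SENSITIVE_COLUMNS = {"email", "phone"}


def mask_value(value: str, visible: int = 2) -> str:
    """Keep the first `visible` characters and replace the rest with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


def mask_row(row: dict) -> dict:
    """Return a copy of a result row with sensitive columns masked."""
    masked = {}
    for column, value in row.items():
        if column in SENSITIVE_COLUMNS and value is not None:
            masked[column] = mask_value(str(value))
        else:
            masked[column] = value
    return masked
```

For example, `mask_row({"name": "alice", "phone": "0912345678"})` leaves `name` untouched and turns `phone` into `09********`.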

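Authentication and authorization:

Enforce multi-factor authentication (MFA).
Make sure only authorized users can access the NLQ system, managed through Azure Active Directory (AAD).
Control how API keys are used and managed.

This reminds me of an interesting post about a security problem caused by how a ChatGPT API key was used: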

API key compromised, API key security

I have an iOS App using openai api for chatgpt 3.5 only. over the weekend I found out that my api key was used for…

community.openai.com
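
One common way to reduce API key exposure is to keep the key on the server side and read it from the environment instead of hard-coding it into an app. The Python sketch below shows that idea; it is only an illustration, `load_api_key` and `redact` are hypothetical helper names, and `OPENAI_API_KEY` is used here simply as a conventional variable name. In a production setup a secret manager would usually sit behind this.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from an environment variable and fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return key


def redact(key: str) -> str:
    """Return a form of the key that is safe to print in logs."""
    if len(key) <= 4:
        return "*" * len(key)
    return key[:4] + "*" * (len(key) - 4)
```

The `redact` helper is there because keys also tend to leak through log files; only the redacted form should ever be written out.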

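Defending against malicious attacks:

Use parameterized queries and prepared statements to avoid SQL injection attacks.
When a natural language processing (NLP) model parses the query, add security checks and filtering.
Apply input validation and filtering.
Carry out regular audits, vulnerability scans, and penetration tests.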
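
The difference between string concatenation and a parameterized query is easier to see in code. The sketch below is my own illustration using Python's built-in `sqlite3` module as a stand-in database; the `orders` table and the function names are hypothetical. In an NL to SQL flow, the idea would be for the model to pick a query template while user-supplied values are passed separately as parameters.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two sample orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, total) VALUES (?, ?)",
        [("alice", 120.0), ("bob", 75.5)],
    )
    conn.commit()
    return conn


def find_orders_unsafe(conn: sqlite3.Connection, customer: str) -> list:
    """Vulnerable version: the value is pasted straight into the SQL text."""
    query = f"SELECT id, customer, total FROM orders WHERE customer = '{customer}'"
    return conn.execute(query).fetchall()


def find_orders(conn: sqlite3.Connection, customer: str) -> list:
    """Safer version: the ? placeholder keeps the value separate from the query."""
    query = "SELECT id, customer, total FROM orders WHERE customer = ?"
    return conn.execute(query, (customer,)).fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    payload = "x' OR '1'='1"
    print(len(find_orders_unsafe(db, payload)))  # every row leaks
    print(find_orders(db, payload))              # payload is treated as a plain string
```

With the classic payload `x' OR '1'='1`, the concatenated version returns every row in the table, while the parameterized version looks for a customer literally named that and finds nothing.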

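Logging and monitoring:

Retain logs, and when deciding what to keep, look at the integrations between the different platforms and systems on the architecture diagram.
Use Azure Monitor and Azure Security Center for monitoring and threat detection, optionally together with Azure Sentinel.
Set up automated alerts to handle anomalous behavior.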
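
For the logging items above, it helps to record who asked what and which SQL was actually generated, so that the record can later be shipped to whatever monitoring platform is in use. Here is a minimal Python sketch using only the standard library; the logger name `nl2sql.audit`, the field names, and the helper functions are my own assumptions rather than anything prescribed by Azure.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("nl2sql.audit")  # hypothetical logger name


def build_audit_record(user_id: str, question: str,
                       generated_sql: str, allowed: bool) -> dict:
    """Collect the fields worth keeping for each NL to SQL request."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "question": question,
        "generated_sql": generated_sql,
        "allowed": allowed,
    }


def log_query(user_id: str, question: str,
              generated_sql: str, allowed: bool) -> str:
    """Write one JSON line per request; blocked queries are logged as warnings."""
    record = build_audit_record(user_id, question, generated_sql, allowed)
    line = json.dumps(record, ensure_ascii=False)
    logger.log(logging.INFO if allowed else logging.WARNING, line)
    return line
```

Logging blocked queries at a higher level makes it straightforward to build an alert on them, which ties in with the automated alerting item in the list.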
Serverless security is often overlooked. For security controls, see the following references:

Securing Azure Functions

Learn about how to make your function code running in Azure more secure from common attacks.

learn.microsoft.com

List of AWS Config Managed Rules

You can use the following AWS Config managed rules to evaluate whether your AWS resources comply with common best…

docs.aws.amazon.com

Security in AWS Lambda

Configure AWS Lambda to meet your security and compliance objectives, and learn how to use other AWS services that help…

docs.aws.amazon.com

aws-config-rules/aws-config-conformance-packs/Security-Best-Practices-for-Lambda.yaml at master ·…

Node, Python, Java] Repository of sample Custom Rules for AWS Config. …

github.com

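This post was written with the help of AI, which assisted in analyzing and translating the original article.

I want to write up my experiences as shared tutorials, and I hope to encourage a culture of sharing in the community. I also hope others will go on to write posts of their own. If you are studying on your own and have questions about registering or preparing for an exam, you can connect with me on LinkedIn or contact me through Facebook, and I will do my best to help within my abilities. (If you would just like to meet up for a coffee, that is welcome too. I enjoy getting to know people in the industry, and quite a few have already reached out to chat!)

Other contact methods: https://kuronetwork.me/contact/
All posts: https://kuronetwork.me/posts/
About me: https://kuronetwork.me/about/
LinkedIn: https://www.linkedin.com/in/kurohuang/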

Written by Kuro Huang
