By dazfuller

Dropping a SQL table in your Synapse Spark notebooks (Python Edition)

Since writing the original post about how to drop a SQL table from a Spark notebook, I've been meaning to follow it up with a version for those using PySpark. It might seem a straightforward thing to do, but connectivity and the quirks of Synapse don't make it quite so simple. So here's the short and (hopefully) easy-to-follow post on how to do it in PySpark. If you want to know why these posts exist then I encourage you to read the original.


This time we're going to use PyODBC, which is already deployed with Spark pools, so there's no need to install additional libraries. The first thing we need to do, as before, is use Synapse to get our connection string and an access token.

import struct
import pyodbc

Okay, so the actual first thing is a couple of imports, but then it's getting our information.

server = f"{mssparkutils.env.getWorkspaceName()}.sql.azuresynapse.net"
dbname = "curation"
auth_key = bytes(mssparkutils.credentials.getToken("DW"), 'utf8')
driver = "{ODBC Driver 17 for SQL Server}"
table = "dbo.example_table"

connection_string = f"DRIVER={driver};SERVER={server};DATABASE={dbname}"

Again we're using mssparkutils to get our workspace name, and we're using it to get an access token.
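Just as a sanity check, here's roughly what the connection string comes out as (using a hypothetical workspace name in place of what mssparkutils would return):

```python
# Hypothetical values standing in for what mssparkutils would return
driver = "{ODBC Driver 17 for SQL Server}"
server = "myworkspace.sql.azuresynapse.net"
dbname = "curation"

connection_string = f"DRIVER={driver};SERVER={server};DATABASE={dbname}"
print(connection_string)
# DRIVER={ODBC Driver 17 for SQL Server};SERVER=myworkspace.sql.azuresynapse.net;DATABASE=curation
```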


Connecting using PyODBC and an access token is possible with newer versions of the driver, along with a host of other Active Directory based connectivity options. But connecting with a token is pretty poorly documented right now, with the best information coming from a GitHub issue. The short version is that you need to take the authentication key and convert it to the type the driver expects.

# The driver expects the token as UTF-16-LE, so interleave each
# UTF-8 byte with a zero byte
exp_token = b""
for i in auth_key:
    exp_token += bytes([i])
    exp_token += bytes(1)

# Prefix with the byte length as a 4-byte integer
token_struct = struct.pack("=i", len(exp_token)) + exp_token
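As a side note, the byte-interleaving loop above is just UTF-16-LE encoding done by hand, so since access tokens are plain ASCII an equivalent and more compact sketch would be (using a hypothetical token value for illustration):

```python
import struct

token = "example-access-token"  # hypothetical token value for illustration
token_bytes = token.encode("utf-16-le")
token_struct = struct.pack("=i", len(token_bytes)) + token_bytes
```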

Once that's done though, connecting is pretty easy, and once connected we can check for our table and drop it if it exists (1256 is the SQL_COPT_SS_ACCESS_TOKEN option).

with pyodbc.connect(connection_string, attrs_before={ 1256: token_struct }) as conn:
    # Synapse doesn't play nicely with transactions over ODBC,
    # so enable auto-commit (see below)
    conn.autocommit = True
    cursor = conn.cursor()
    # OBJECT_ID returns NULL if the table doesn't exist
    cursor.execute("SELECT OBJECT_ID(?, 'U')", table)
    object_id = cursor.fetchone()[0]
    if object_id is not None:
        print(f"Dropping table {table}")
        # DDL statements can't be parameterised, so the table name
        # is interpolated directly
        cursor.execute(f"DROP TABLE {table}")
    else:
        print(f"{table} does not exist")

There is (surprise) a quirk with connecting to Azure Synapse through the ODBC driver, however, which can cause problems with transactions (the driver's default behaviour). To work around this we have to set the connection's auto-commit to true.


And that's it. Have fun and remember to double check which table you're dropping before you run the script 😉
