This is a quick start on using the Cosmos DB Bindings with HTTP Triggers on Isolated Runtime Azure Functions.


I threw this together because all the Isolated Runtime examples were using Queues, which don't need the extra output that HTTP Triggers require to return a proper status.

Ch...

Today I wanted to dig into utilizing Data Zones in your data estate. You probably already know the basics around raw, enriched, and production.

Made a quick video on the difference between Allow Azure Services and Allow Trusted Azure Services.

Today I just wanted to highlight a simple Azure SQL and Azure Queue example that uses Azure Functions to schedule and move data from a REST API to Azure SQL in parallel. It makes heavy use of bindings to keep the code very tight while being highly functional. My favorite part is the SchedulePokemonQueue function: in 11 lines of code it gets all of the data from the SQL DB and drops it onto the queue to be processed, with essentially just some scaffolding and a for loop.
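To give a feel for the fan-out shape of SchedulePokemonQueue, here is a minimal Python sketch that uses in-memory stand-ins rather than the real Azure bindings. It is not the code from the post: `sqlite3` stands in for the SQL DB, `queue.Queue` stands in for the Azure Queue, and the `pokemon` table, its columns, and the `schedule_queue` helper are all hypothetical names chosen for illustration.

```python
import json
import queue
import sqlite3


def schedule_queue(conn, out_queue):
    """Read every row from the (hypothetical) pokemon table and
    enqueue one JSON message per row. Returns the number of messages."""
    rows = conn.execute("SELECT id, name FROM pokemon ORDER BY id").fetchall()
    for row_id, name in rows:
        # One message per row, so downstream workers can process in parallel.
        out_queue.put(json.dumps({"id": row_id, "name": name}))
    return len(rows)


# Stand-in for the SQL DB: an in-memory SQLite table with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO pokemon (id, name) VALUES (?, ?)",
    [(1, "bulbasaur"), (4, "charmander"), (7, "squirtle")],
)

# Stand-in for the Azure Queue output binding.
work_queue = queue.Queue()
scheduled = schedule_queue(conn, work_queue)
```

In the real function, the bindings replace both the connection setup and the explicit `put` calls, which is why so little code is left beyond the query and the loop.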

User Defined Functions (UDFs) allow you to easily build logic to process columns in Spark, but they can often be inefficient, especially when written in Python. Scala UDFs are significantly faster than Python UDFs, as in orders of magnitude faster. I recently worked with someone who needed a UDF to process a few hundred GB of data. After switching from a Python UDF to a prebuilt Scala UDF, processing time went from giving up after 8 hours to around 15 minutes. Figuring out how to do this was a challenge, though, so I want to document the process for others.

Had a recent issue come up where a customer was trying to use the Python library twobitreader in a UDF to pull out some genetic information for individual genes. Think of it like being able to look up a range of characters from a file and output them as a string. The problem they were running into...