Gopal Krishna Ranjan

Gopal is a passionate Data Engineer and Data Analyst. He has implemented many end to end solutions using Big Data, Machine Learning, OLAP, OLTP, and cloud technologies. He loves to share his experience at https://sqlrelease.com//. Connect with Gopal on LinkedIn at https://www.linkedin.com/in/ergkranjan/.

Tidy Data in Python – First Step in Data Science and Machine Learning

Most of the Data Science / Machine Learning projects follow the Pareto principle where we spend almost 80% of the time in data preparation and remaining 20% in choosing and training the appropriate ML model. Mostly, the datasets we get to create Machine Learning models are messy datasets and cannot be fitted into the model […]

Tidy Data in Python – First Step in Data Science and Machine Learning Read More »

Python use case – Import data from excel to sql server table – SQL Server 2017

If we need to import data from an excel file into SQL Server, we can use these methods: SQL Server Import Export Wizard Create an SSIS package to read excel file and load data into a SQL Server table Use T-SQL OPENROWSET query Use the read_excel method of Python’s pandas library (Only available in SQL Server 2017

Python use case – Import data from excel to sql server table – SQL Server 2017 Read More »

Python use case – Import zipped file without unzipping it in SSIS and SQL Server – SQL Server 2017

Import zipped CSV file without unzipping it in SSIS using SQL Server 2017 SQL Server Integration Services (SSIS) is one of the most popular ETL tools. It has many built-in components which can be used in order to automate the enterprise ETL(Extract, Transform, and Load). Also, if we need a customized component which is not

Python use case – Import zipped file without unzipping it in SSIS and SQL Server – SQL Server 2017 Read More »

Import CSV file into SQL Server using T-SQL query

Sometimes, we need to read an external CSV file using T-SQL query in SQL Server. Due to some functional limitations, we cannot use the import-export wizard functionality in such kinds of scenarios as we need the result set in the middle of the execution of the other queries. There, we can use the BULK INSERT

Import CSV file into SQL Server using T-SQL query Read More »

Python use case – Dynamic UNPIVOT using pandas – SQL Server 2017

In this post, we are going to learn how we can leverage the power of Python’s pandas module in SQL Server 2017. pandas is an open source Python library providing data frame as data structure similar to the SQL table with the vectorized operation support for high performance. To know more about pandas, you can click

Python use case – Dynamic UNPIVOT using pandas – SQL Server 2017 Read More »

Handling special characters in Hive (using encoding properties)

In case we are reading a text file in a Hive table which contains non-English characters and we are not using the appropriate text encoding, these non-English characters might be loaded as junk symbols (like boxes – �). To get these characters in their original form, we need to use the correct character encoding. In this

Handling special characters in Hive (using encoding properties) Read More »

Skip header and footer rows in Hive

In this post “Skip header and footer rows in Hive“, we are going to learn that how we can ignore few header and footer records in Hive without loading or reading these records in another table or in a view temporarily. If you want to read more about Hive, visit my post “Preserve Hive metastore in

Skip header and footer rows in Hive Read More »

Preserve Hive metastore in Azure HDInsight

In this blog “Preserve Hive metastore in Azure HDInsight“, we are going to learn how we can preserve the hive metadata while working with the Azure HDInsight services. Microsoft Azure HDInsight is an on-demand managed Open source Big Data analytics service for the enterprises. We can provision clusters as per the demand in few minutes,

Preserve Hive metastore in Azure HDInsight Read More »