Migrate hierarchical SQL Server data to Azure CosmosDB

I am exploring options to migrate data from SQL Server 2016 into CosmosDB with the SQL API. The source data is relational; I need to join the tables and migrate to CosmosDB in such a way that each row becomes a document. Here is an example.
I want to migrate these individual Product table rows, combining each with its related ProductSubcategory table row, into something like the document below.
{
  "ProductID": 970,
  "Name": "Touring-2000 Blue, 46",
  "ProductNumber": "BK-T44U-46",
  "MakeFlag": true,
  "FinishedGoodsFlag": true,
  "Color": "Blue",
  "ModifiedDate": "2014-02-08T10:01:36.827",
  "ProductSubCatalog": {
    "ProductSubcategoryID": 3,
    "Name": "Touring Bikes"
  }
}
I tried the DocumentDB Data Migration Tool with the source set to "SQL" and the query below.
SELECT
    [ProductID], P.[Name], [ProductNumber], [MakeFlag],
    [FinishedGoodsFlag], [Color], [SafetyStockLevel], [ReorderPoint],
    [StandardCost], [ListPrice],
    PS.ProductSubcategoryID AS 'ProductSubCatalog.ProductSubcategoryID',
    PS.Name AS 'ProductSubCatalog.Name'
FROM
    [Production].[Product] P
LEFT JOIN
    Production.ProductSubcategory PS ON P.ProductSubcategoryID = PS.ProductSubcategoryID
FOR JSON PATH, ROOT('Product')
Since the above query generates a single JSON document for all rows in the Product table, the DocumentDB Data Migration Tool imports it as one document. Does anyone know of other options to achieve this without writing a custom application to query the SQL Server data, build the JSON documents, and import them? Any suggestions and pointers will be appreciated.
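One option that may avoid a custom application, assuming the Data Migration Tool's "Nesting Separator" setting behaves as documented for SQL sources: drop the FOR JSON clause so the query returns one row per product, keep the dotted column aliases, and set the Nesting Separator to "." in the tool. Each row should then import as its own document with a nested ProductSubCatalog object. A sketch of the query:
SELECT
    P.[ProductID], P.[Name], P.[ProductNumber], P.[MakeFlag],
    P.[FinishedGoodsFlag], P.[Color], P.[SafetyStockLevel], P.[ReorderPoint],
    P.[StandardCost], P.[ListPrice],
    PS.ProductSubcategoryID AS 'ProductSubCatalog.ProductSubcategoryID',
    PS.Name AS 'ProductSubCatalog.Name'
FROM [Production].[Product] P
LEFT JOIN Production.ProductSubcategory PS
    ON P.ProductSubcategoryID = PS.ProductSubcategoryID;
-- No FOR JSON here: the tool receives a plain rowset and, with the
-- Nesting Separator set to ".", builds one document per row.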

Related

Azure DataFactory. Convert date format on import to SQL DB

I am using Azure Data Factory to import a number of CSV files into an Azure SQL DB. I have a problem with the date fields in the database, which are currently set as DATE (I have also tried DATETIME).
The dates in the CSV look like
20160700000000
and when I try to map the CSV headings to DB columns in Azure Data Factory it tells me they are incompatible.
Do I need to modify the type in the DB to something other than DATE/DATETIME, or is there something I can do in the import pipeline within Data Factory?
Any help much appreciated!
You've got three options:
Fix the data source's export to use a standard format (and I don't see any day component in your presumably accurate sample of the data).
Massage the column with a Hive job.
Utilize a stored procedure to fix the date column on insert (a sketch follows below).
I'm assuming you're doing a straight table-mapped dataset. If you have no control over the source data then you'll have to get creative.
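For the stored-procedure route, a minimal T-SQL sketch of the date fix, assuming a hypothetical staging table and column (StagingDates.RawDate is made up for illustration) and that the value really is yyyyMM padded with zeros:
-- Parse 'yyyyMM00000000' by taking year and month and assuming day 1.
SELECT
    RawDate,
    DATEFROMPARTS(
        CAST(LEFT(RawDate, 4) AS int),         -- year
        CAST(SUBSTRING(RawDate, 5, 2) AS int), -- month
        1                                      -- day (not present in the data)
    ) AS ParsedDate
FROM dbo.StagingDates;
A stored procedure used as the copy sink could wrap an INSERT built around this expression to land the value in a DATE column.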
I don't quite understand what date and time the value "20160700000000" represents. If it means "7/1/2016 12:00:00 AM", you could set the source structure like the following to parse such a value as DateTime:
"structure":[
{ "name": "lastlogindate", "type": "Datetime", "format": "yyyyMM00000000"}]
Refer to this link for more details: https://azure.microsoft.com/en-us/documentation/articles/data-factory-azure-sql-connector/#specifying-structure-definition-for-rectangular-datasets

How to investigate file structure in mongodb?

Being a newbie to mongodb, my apologies if this is a trivial question, hopefully not.
I've got a set of 10 JSON files generated from a MongoDB database, for which I would like to investigate the structure (fields, names, sizes, etc.). I know how to import JSON files into a db/collection on my Mongo system and then use db.<collection>.find(). However, that works for specific fields which you need to know in advance, right?
In a relational SQL DB I would simply import the files and then go through the field names, sizes, even export it to a standard file format (.CSV, etc).
When you know nothing about your source files, what is the equivalent strategy for MongoDB?
Is there a command that lists all the fields/elements in the doc?
Can you export the DB/collection from Mongo to a different format that is more manageable for discovery (e.g. .csv, or a SQL database)?
Thanks in advance, p
To list all the fields in a Mongo collection, use Variety (https://github.com/variety/variety)
OR
mr = db.runCommand({
    "mapreduce": "my_collection",           // collection to scan
    "map": function() {
        // emit every key name found in each document
        for (var key in this) { emit(key, null); }
    },
    "reduce": function(key, stuff) { return null; },
    "out": "my_collection" + "_keys"        // results land in my_collection_keys
})
Then run distinct on the resulting collection so as to find all the keys:
db[mr.result].distinct("_id")
["foo", "bar", "baz", "_id", ...]
See also: MongoDB Get names of all keys in collection
FOR MONGO EXPORT
https://docs.mongodb.com/manual/reference/program/mongoexport/
For Mongo export in CSV format, you have to specify the fields you want to export, which you can get from the step above.
mongoexport --db users --collection contacts --type=csv --fields name,address --out /opt/backups/contacts.csv

Generalized way to extract JSON from a relational database?

Ok, maybe this is too broad for StackOverflow, but is there a good, generalized way to assemble data in relational tables into hierarchical JSON?
For example, let's say we have a "customers" table and an "orders" table. I want the output to look like this:
{
  "customers": [
    {
      "customerId": 123,
      "name": "Bob",
      "orders": [
        {
          "orderId": 456,
          "product": "chair",
          "price": 100
        },
        {
          "orderId": 789,
          "product": "desk",
          "price": 200
        }
      ]
    },
    {
      "customerId": 999,
      "name": "Fred",
      "orders": []
    }
  ]
}
I'd rather not have to write a lot of procedural code to loop through the main table and fetch orders a few at a time and attach them. It'll be painfully slow.
The database I'm using is MS SQL Server, but I'll need to do the same thing with MySQL soon. I'm using Java and JDBC for access. If either of these databases had some magic way of assembling these records server-side it would be ideal.
How do people migrate from relational databases to JSON databases like MongoDB?
Here is a useful set of functions for converting relational data to JSON and XML and from JSON back to tables: https://www.simple-talk.com/sql/t-sql-programming/consuming-json-strings-in-sql-server/
I think one 'generalized' solution would be as follows:
Create a 'select' query which joins all the required tables and fetches the results as a two-dimensional array (like a CSV / temporary table, etc.).
If each row of this join is unique, and the MongoDB schema and the columns have a one-to-one mapping, then it's all about importing this CSV/table using the mongoimport command with the required parameters.
But a case like the one above, where a given customer ID can have an array of 'orders', needs some computation before mongoimport.
You will have to write a program which can 'vertically merge' the orders for a given customer ID. For a small set of data, a simple Java program will work; for larger sets, parallel processing using Spark can do the job.
SQL Server 2016 now supports reading JSON in much the same way as it has supported XML for many years, using OPENJSON to query JSON text directly and NVARCHAR columns to store it.
SQL Server 2016 is finally catching up and adding support for JSON.
The JSON support still does not match other products such as PostgreSQL, e.g. no JSON-specific data type is included. However, several useful T-SQL language elements were added that make working with JSON a breeze.
E.g. in the following Transact-SQL code a text variable containing a JSON string is defined:
DECLARE @json NVARCHAR(4000)
SET @json =
N'{
  "info": {
    "type": 1,
    "address": {
      "town": "Bristol",
      "county": "Avon",
      "country": "England"
    },
    "tags": ["Sport", "Water polo"]
  },
  "type": "Basic"
}'
and then, you can extract values and objects from JSON text using the JSON_VALUE and JSON_QUERY functions:
SELECT
    JSON_VALUE(@json, '$.type') AS type,
    JSON_VALUE(@json, '$.info.address.town') AS town,
    JSON_QUERY(@json, '$.info.tags') AS tags
Furthermore, the OPENJSON function lets you return the elements of a referenced JSON array:
SELECT value
FROM OPENJSON(@json, '$.info.tags')
Last but not least, there is a FOR JSON clause that can format a SQL result set as JSON text:
SELECT object_id, name
FROM sys.tables
FOR JSON PATH
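Tying this back to the customers/orders example in the question: a correlated subquery with FOR JSON PATH can build the nested array on the server. This is only a sketch; it assumes tables named customers and orders with the column names used in the question, and the JSON_QUERY(ISNULL(...)) wrapper is there so a customer with no orders gets an empty array instead of a missing property:
SELECT
    c.customerId,
    c.name,
    JSON_QUERY(ISNULL(
        (SELECT o.orderId, o.product, o.price
         FROM orders o
         WHERE o.customerId = c.customerId
         FOR JSON PATH), '[]')) AS orders
FROM customers c
FOR JSON PATH, ROOT('customers');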
Some references:
https://docs.microsoft.com/en-us/sql/relational-databases/json/json-data-sql-server
https://docs.microsoft.com/en-us/sql/relational-databases/json/convert-json-data-to-rows-and-columns-with-openjson-sql-server
https://blogs.technet.microsoft.com/dataplatforminsider/2016/01/05/json-in-sql-server-2016-part-1-of-4/
https://www.red-gate.com/simple-talk/sql/learn-sql-server/json-support-in-sql-server-2016/
There is no generalized way because SQL Server doesn't support JSON as a native data type. You'll have to create your own "generalized way" for this.
Check out this article. There are good examples there on how to manipulate sql server data to JSON format.
https://www.simple-talk.com/blogs/2013/03/26/sql-server-json-to-table-and-table-to-json/

Reading from SQL API after writing with MongoDB API using CosmosDB

Let's say I'm adding the following document to a CosmosDB collection called 'schoolInfo', using the MongoDB API.
var obj1 = {
  "type": "student",
  "studentId": "abc",
  "task_status": [
    {
      "status": "Current",
      "date": 516760078
    },
    {
      "status": "Late",
      "date": 1516414446
    }
  ],
  "student_plan": "n"
}
(A) How do I create a SQL API Cosmos instance to read that same collection data?
I need the SQL API because I need to read that data in an Azure Function, and according to this, only SQL and Graph APIs are available for Azure Functions.
https://docs.microsoft.com/en-us/azure/cosmos-db/serverless-computing-database#comments-container
(B) How would I query this from the SQL API? I'm assuming I can use schoolInfo as the Table Name, but how do I know/ensure Column Names are available - given that task_status is a nested/array object?
Any quick examples would be appreciated.
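Not an authoritative answer, but a sketch of what the read side could look like, assuming the documents surface through the SQL API with the same property names they were written with (that is an assumption; documents written through the MongoDB API are stored in an internal representation and may not appear exactly as shown). There is no fixed column list in the SQL API; properties are addressed by path on each document:
SELECT c.studentId, c.student_plan, c.task_status
FROM c
WHERE c.type = "student"
and the nested task_status array can be flattened with an intra-document join:
SELECT c.studentId, t.status, t.date
FROM c
JOIN t IN c.task_status
WHERE c.type = "student"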

Insert JSON array data using SQLite + Ionic 3

I have a JSON array like this:
[
  {
    "name": "24 Jul 2015",
    "address": "mohd.aquib09#gmail.com",
    "phone": "456456465"
  }
]
and my SQLite code is:
for (var i = 0; i < t1.length; i++) {
  // one INSERT per array element
  db.executeSql("INSERT INTO tbl_deal(name,address,phone) VALUES('" + t1[i].name + "','" + t1[i].address + "','" + t1[i].phone + "')", {})
}
My problem is that when a large amount of data comes in, it takes a long time to insert all of it. Is there an easy solution for this, ideally without using the loop?
The problem is that you are inserting one row at a time; that's why this approach takes longer on bulk data.
You can use prepared statements instead. Create batches of, say, 10,000 rows and do a bulk insert for each batch. This will surely improve your insert performance.
Here is the sample link you can refer: http://www.sqlitetutorial.net/sqlite-java/insert/
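Another option, different from the prepared-statement batches suggested above: SQLite 3.7.11 and later accepts multiple row value lists in a single INSERT, so each batch can be one parameterized statement instead of one statement per row. The SQL shape, using the tbl_deal table from the question:
INSERT INTO tbl_deal (name, address, phone)
VALUES (?, ?, ?),
       (?, ?, ?),
       (?, ?, ?);
-- Bind three parameters per row and size the VALUES list to the batch,
-- keeping the total parameter count under SQLite's limit (999 in many builds).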
