Documentation

Learn about a few of the features available on our platform.

Common Transformations

Transformations are operations that can be performed on your dataset in order to modify either the columns or the data values. A few of the more common transformations are listed below.

  • Splitting a column into multiple columns
  • Merging multiple columns into a single column
  • Extracting data from one column into a new column
  • Adding a new column with the results of mathematical operations
  • Deleting or renaming columns
  • Removing blank spaces from the ends of data values
  • Converting data values to uppercase or lowercase
  • Replacing specific text in data values

Example 1: Replacing NULL values with 0 for the user age column
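To make the effect of Example 1 concrete, here is a minimal Java sketch of the same idea applied to in-memory rows. It is not how the platform implements the transformation; the row representation (a list of maps) and the names NullToZero and replaceNullWithZero are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NullToZero {
    // Replaces a null (or missing) value in the given column with 0, in place.
    static void replaceNullWithZero(List<Map<String, Object>> rows, String column) {
        for (Map<String, Object> row : rows) {
            if (row.get(column) == null) {
                row.put(column, 0);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one user has no age recorded.
        Map<String, Object> alice = new HashMap<>();
        alice.put("name", "alice");
        alice.put("age", 31);
        Map<String, Object> bob = new HashMap<>();
        bob.put("name", "bob");
        bob.put("age", null);

        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(alice);
        rows.add(bob);

        replaceNullWithZero(rows, "age");
        System.out.println(rows.get(0).get("age")); // 31 (unchanged)
        System.out.println(rows.get(1).get("age")); // 0 (was null)
    }
}
```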

Complex Transformations

The following transformations are more complex and require some explanation.

  • Aggregation: grouping identical data values together and removing duplicate rows
  • Adding a new column with specialized mathematical operations (e.g., operations that aggregate the data)
  • Editing or deleting all data values that satisfy some condition
  • Extracting JSON to remove nesting from JSON data
  • Exploding JSON to create multiple rows from JSON arrays

Example 2: Deleting all rows that satisfy a complex condition
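As an illustration of what a complex condition can look like, the sketch below deletes in-memory rows that match a compound predicate. It only mirrors the idea of Example 2 and does not show the platform's own mechanism; the row layout, the column names, and the DeleteRows helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class DeleteRows {
    // Builds a hypothetical row with three columns.
    static Map<String, Object> row(String name, Integer age, String country) {
        Map<String, Object> r = new HashMap<>();
        r.put("name", name);
        r.put("age", age);
        r.put("country", country);
        return r;
    }

    // Removes, in place, every row for which the condition holds.
    static void deleteWhere(List<Map<String, Object>> rows,
                            Predicate<Map<String, Object>> condition) {
        rows.removeIf(condition);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("ann", 15, "US"));
        rows.add(row("bob", 40, null));
        rows.add(row("cy", 40, "FR"));

        // Compound condition: the user is under 18 OR has no country.
        Predicate<Map<String, Object>> underage = r -> ((Integer) r.get("age")) < 18;
        Predicate<Map<String, Object>> noCountry = r -> r.get("country") == null;
        deleteWhere(rows, underage.or(noCountry));

        System.out.println(rows.size()); // 1: only "cy" remains
    }
}
```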

API

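Datachili's REST API allows you to automate transformations, joins, and unions on your datasets. The base URL for the API is http://datachili.com/api/v1, and your secret access token can be obtained at http://datachili.com/users/edit. As the Java examples for each endpoint show, every request passes the token in an Authorization header of the form Token token="<your token>".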
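As a rough sketch of how the base URL and the token fit together, the following builds (without sending) a request using the JDK's built-in java.net.http API (Java 11 or later) rather than the Apache HttpClient used in the examples on this page. The class and method names (ApiRequests, authHeader, get) and the sample token are made up for illustration; the header format is copied from the Java examples on this page.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class ApiRequests {
    static final String BASE_URL = "http://datachili.com/api/v1";

    // Header value in the same format as the Java examples on this page.
    static String authHeader(String secretAccessToken) {
        return "Token token=\"" + secretAccessToken + "\"";
    }

    // Builds, but does not send, an authenticated GET request for a path under the base URL.
    static HttpRequest get(String path, String secretAccessToken) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + path))
                .header("Authorization", authHeader(secretAccessToken))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // "my-secret-token" is a placeholder, not a real token.
        HttpRequest req = get("/datasets/list", "my-secret-token");
        System.out.println(req.method() + " " + req.uri());
        System.out.println(req.headers().firstValue("Authorization").orElse(""));
    }
}
```

Sending the request would then be a matter of passing it to java.net.http.HttpClient; that step is left out here so the sketch runs without network access.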

List Datasets
URL /datasets/list
Method GET
Params (none)
Response
{"datasets":[{"fileSize":241212, "id":1, "name": "IMDB"},{"fileSize":892223, "id":2, "name": "Books"}]}
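Java example
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject; // assumed: JSONObject from the org.json library

// Build the request URL and attach the secret access token.
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/list");
HttpGet req = new HttpGet(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

// Send the request, then parse the JSON response body.
HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();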
Transform
URL /datasets/transform
Method POST
Params
{"scriptId":"335"}
Response
{"jobId":539}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/transform");
HttpPost req = new HttpPost(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

// POST parameters are sent as a URL-encoded form body.
List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("scriptId", "335"));
req.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();
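The POST examples on this page wrap their parameters in a UrlEncodedFormEntity, which sends them as an application/x-www-form-urlencoded body. The sketch below shows roughly what such a body looks like, using only the JDK; FormBody and encode are illustrative names, and the exact bytes produced by Apache HttpClient may differ in small details.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class FormBody {
    // Encodes parameters as key=value pairs joined by '&', percent-encoding keys and values.
    static String encode(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> transform = new LinkedHashMap<>();
        transform.put("scriptId", "335");
        System.out.println(encode(transform)); // scriptId=335

        Map<String, String> join = new LinkedHashMap<>();
        join.put("newDatasetName", "Unified Movies");
        join.put("joinFields1", "actor_id|act_first_name");
        // Spaces become '+', and '|' becomes %7C.
        System.out.println(encode(join));
    }
}
```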
Join
URL /datasets/join
Method POST
Params
{"newDatasetName":"Unified Movies", "datasetId1":"599", "datasetId2":"600", 
"joinFields1":"actor_id|act_first_name", "joinFields2":"actor_id|act_first_name", "joinType":"inner"}
Response
{"jobId":129}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/join");
HttpPost req = new HttpPost(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

// Join fields for each dataset are passed as a single pipe-separated string.
List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("newDatasetName", "Unified Movies"));
params.add(new BasicNameValuePair("datasetId1", "599"));
params.add(new BasicNameValuePair("datasetId2", "600"));
params.add(new BasicNameValuePair("joinFields1", "actor_id|act_first_name"));
params.add(new BasicNameValuePair("joinFields2", "actor_id|act_first_name"));
params.add(new BasicNameValuePair("joinType", "inner"));
req.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();
Union
URL /datasets/union
Method POST
Params
{"targetId":"599", "targetColumns":"actor_id|act_first_name|act_last_name", 
"datasetIdToColumns":"599|actor_id|act_first_name|act_last_name:600|actor_id|act_first_name|act_last_name"}
Response
{"jobId":35}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/union");
HttpPost req = new HttpPost(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("targetId", "599"));
params.add(new BasicNameValuePair("targetColumns", "actor_id|act_first_name|act_last_name"));
params.add(new BasicNameValuePair("datasetIdToColumns",
        "599|actor_id|act_first_name|act_last_name:600|actor_id|act_first_name|act_last_name"));
req.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();
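Judging from the example value above, datasetIdToColumns appears to list each dataset as its ID followed by its columns, separated by "|", with datasets separated by ":". That format is inferred from the single example rather than from a specification, so treat the helper below as a sketch; UnionParams and its method name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class UnionParams {
    // Builds "id|col|col:id|col|col" from a map of dataset ID to column names.
    // The format is inferred from the example on this page, not from a spec.
    static String datasetIdToColumns(Map<String, List<String>> columnsByDatasetId) {
        return columnsByDatasetId.entrySet().stream()
                .map(e -> e.getKey() + "|" + String.join("|", e.getValue()))
                .collect(Collectors.joining(":"));
    }

    public static void main(String[] args) {
        List<String> columns = List.of("actor_id", "act_first_name", "act_last_name");

        // LinkedHashMap preserves the order in which datasets are added.
        Map<String, List<String>> byDataset = new LinkedHashMap<>();
        byDataset.put("599", columns);
        byDataset.put("600", columns);

        System.out.println(datasetIdToColumns(byDataset));
    }
}
```

Building the string from a map keeps the column lists in one place, which makes it harder to mistype a separator when more datasets are added.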
Add File
URL /datasets/addFile
Method POST
Params
{"fileData":"TWFuIGlzIGRpc3Rpbmd1aXNoZWQsI", "userDefinedName":"Movies IMDB", 
"fileExtension":"csv", "hasHeader":true}
Response
{"datasetId":45}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/addFile");
HttpPost req = new HttpPost(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

// Read the file and Base64-encode its bytes for the fileData parameter.
Path source = Paths.get("Movies.csv");
String base64data = DatatypeConverter.printBase64Binary(Files.readAllBytes(source));
List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("fileData", base64data));
params.add(new BasicNameValuePair("userDefinedName", "Movies IMDB"));
params.add(new BasicNameValuePair("fileExtension", "csv"));
params.add(new BasicNameValuePair("hasHeader", "true"));
req.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();
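The Java example above encodes the file with javax.xml.bind.DatatypeConverter, which is no longer bundled with the JDK from Java 11 onward. A possible alternative, sketched here, is java.util.Base64 (available since Java 8). The FileData class and the inline CSV text are illustrative only; in practice the bytes would come from Files.readAllBytes as in the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class FileData {
    // Encodes raw file bytes as a standard Base64 string for the fileData parameter.
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a CSV file.
        byte[] csv = "id,title\n1,Alien\n".getBytes(StandardCharsets.UTF_8);
        String encoded = toBase64(csv);
        System.out.println(encoded);

        // Decoding returns the original bytes, which confirms the round trip.
        String decoded = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        System.out.println(decoded.equals("id,title\n1,Alien\n")); // true
    }
}
```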
Add Database
URL /datasets/addDB
Method POST
Params
{"dbip":"foobar.com", "dbport":"5432", "dbname":"prod", "dbtable":"users", 
"dbusername":"admin", "dbpwd":"postgres", "dbSelected":"postgres"}
Response
{"error":false}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/addDB");
HttpPost req = new HttpPost(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

// Connection details for the database table to import.
List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("dbip", "foobar.com"));
params.add(new BasicNameValuePair("dbport", "5432"));
params.add(new BasicNameValuePair("dbname", "prod"));
params.add(new BasicNameValuePair("dbtable", "users"));
params.add(new BasicNameValuePair("dbusername", "admin"));
params.add(new BasicNameValuePair("dbpwd", "postgres"));
params.add(new BasicNameValuePair("dbSelected", "postgres"));
req.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();
Job Status
URL /jobs/status
Method GET
Params
{"jobId":"250"}
Response
{"status":"RUNNING", "type":"TRANSFORMATION", "datasetId":12, "name":"Transforming Movies"}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
// For this GET request, jobId is passed as a query parameter.
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/jobs/status")
        .setParameter("jobId", "250");
HttpGet req = new HttpGet(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();
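Because the transform, join, and union endpoints return a jobId, a client will usually poll /jobs/status until the job stops running. The sketch below shows one way to structure that loop. The only status value this page documents is "RUNNING", so the loop simply waits until the status is something else; the "DONE" value, the JobPoller class, and the Supplier standing in for the HTTP call are all assumptions for illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class JobPoller {
    // Fetches the status up to maxAttempts times, pausing between attempts,
    // and returns the last status seen (which may still be "RUNNING").
    static String waitWhileRunning(Supplier<String> fetchStatus, int maxAttempts, long delayMillis) {
        String status = fetchStatus.get();
        for (int attempt = 1; attempt < maxAttempts && "RUNNING".equals(status); attempt++) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return status;
            }
            status = fetchStatus.get();
        }
        return status;
    }

    public static void main(String[] args) {
        // Stand-in for real calls to /jobs/status; "DONE" is a made-up terminal value.
        Iterator<String> fakeStatuses = List.of("RUNNING", "RUNNING", "DONE").iterator();
        String last = waitWhileRunning(fakeStatuses::next, 10, 10);
        System.out.println(last); // DONE
    }
}
```

In a real client, the Supplier would issue the GET request shown in the Java example above and return the "status" field of the JSON response. Capping the number of attempts keeps a stuck job from blocking the caller indefinitely.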
Copy Script
URL /datasets/copyScript
Method POST
Params
{"datasetId":"599", "pasteDatasetId":"600"}
Response
{"success":true}
Java example
HttpClient httpclient = HttpClients.createDefault();
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("datachili.com").setPort(80).setPath("/api/v1/datasets/copyScript");
HttpPost req = new HttpPost(builder.build());
req.addHeader("Authorization", "Token token=\"" + secretAccessToken + "\"");

List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("datasetId", "599"));
params.add(new BasicNameValuePair("pasteDatasetId", "600"));
req.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

HttpResponse authResponse = httpclient.execute(req);
System.out.println(authResponse);
JSONObject authResponseObj = new JSONObject(EntityUtils.toString(authResponse.getEntity()));
System.out.println(authResponseObj);
req.releaseConnection();