# Dataset

## Queries

***

### <mark style="background-color:yellow;">get\_dataset</mark>

Get dataset details and metadata.

#### Parameters

* `id` (conditionally required): The ID of the dataset.
* `solution_id` (conditionally required): The ID of the solution that contains the dataset.
* `slug` (conditionally required): Slug of the dataset.

{% hint style="warning" %}
Pass either `id`, or both `solution_id` and `slug`, to fetch the dataset.
{% endhint %}
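
The either/or rule above can be sketched as a small shell helper that builds the request body. The function name `get_dataset_body` is hypothetical and not part of the API, and the sketch assumes the values contain no characters that need JSON escaping.

```bash
# Hypothetical helper (not part of the API): prints the JSON body for get_dataset.
#   get_dataset_body <id>                  lookup by dataset ID
#   get_dataset_body <solution_id> <slug>  lookup by solution ID and slug
# Assumption: arguments contain no quotes or backslashes, so no JSON escaping is done.
get_dataset_body() {
    if [ "$#" -eq 1 ]; then
        printf '{"query": "get_dataset", "params": {"id": "%s"}}\n' "$1"
    elif [ "$#" -eq 2 ]; then
        printf '{"query": "get_dataset", "params": {"solution_id": "%s", "slug": "%s"}}\n' "$1" "$2"
    else
        # Any other argument count cannot satisfy the rule, so fail early.
        echo "pass either <id> or <solution_id> <slug>" >&2
        return 1
    fi
}
```

The printed body could then be passed to curl, for example with `--data "$(get_dataset_body ds_cv33u4n0i6q45p93i930)"`.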

#### Example with ID

```bash
curl -X POST 'https://app.polyteia.com/api' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer <your_access_token>" \
    --data '{
        "query": "get_dataset",
        "params": {
            "id": "ds_cv33u4n0i6q45p93i930"
        }
    }'
```

#### Example with Solution ID and slug

```bash
curl -X POST 'https://app.polyteia.com/api' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer <your_access_token>" \
    --data '{
        "query": "get_dataset",
        "params": {
            "solution_id": "sol_cv33u4n0i6q45p93i930",
            "slug": "my_unique_slug"
        }
    }'
```

#### **Dataset Response**

* `id`: The ID of the dataset.
* `organization_id`: The ID of the organization the dataset belongs to.
* `solution_id`: The ID of the solution the dataset belongs to.
* `created_at`: The date and time when the dataset was created.
* `updated_at`: The date and time when the dataset was last updated.
* `name`: The name of the dataset.
* `description`: The description of the dataset.
* `slug`: The unique slug of the dataset within a solution.
* `source`: The user-specified source of the data in the dataset.
* `documentation`: A JSON object containing block-type documentation for the dataset. *Documentation of this JSON object is coming soon.*
* `metadata`: See [Dataset Metadata](#dataset-metadata).

#### Dataset Metadata

* `asset_info`: Information about the data uploaded in the dataset.
  * `content_type`: The MIME type of uploaded data.
  * `size`: Total size of uploaded data in bytes.
  * `storage_backend`: The storage backend where the dataset's data is stored.
  * `uploaded_at`: The time of the most recent data upload to the dataset.
* `schema`: The schema of the data within the dataset.
  * `columns`: Technical details of all columns in the dataset.
    * `canonical_type`: A human-facing type for the column, such as URL or phone number.
    * `database_type`: The actual database field type.
    * `preview_name`: A human friendly display name for the column.

#### Example Response

```json
{
    "data": {
        "id": "ds_cv33u4n0i6q45p93i930",
        "organization_id": "org_cv4v6p51bmu5rjcjk720",
        "solution_id": "sol_cv33u4n0i6q45p93i930",
        "name": "Lorem ipsum dolor sit ame",
        "slug": "my_unique_slug",
        "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
        "source": "RRR",
        "created_at": "2025-03-11T17:51:07.257829Z",
        "updated_at": "2025-03-11T18:30:18.866332Z",
        "documentation": {
            "663423b1-cc90-4fe3-8988-dbe9079cd53e": {
                "id": "663423b1-cc90-4fe3-8988-dbe9079cd53e",
                "meta": {
                    "depth": 0,
                    "order": 0
                },
                "type": "HeadingOne",
                "value": [
                    {
                        "children": [
                            {
                                "text": "Lorem Ipsum"
                            }
                        ],
                        "id": "38a979e2-c34a-4c43-bf03-39c0727639ae",
                        "props": {
                            "nodeType": "block"
                        },
                        "type": "heading-one"
                    }
                ]
            }
        },
        "metadata": {
            "asset_info": {
                "content_type": "text/csv",
                "size": 103775,
                "storage_backend": "s3",
                "uploaded_at": "2025-03-11T18:30:18Z"
            },
            "schema": {
                "columns": {
                    "Anzahl Untersuchungen": {
                        "canonical_type": "number",
                        "database_type": "BIGINT",
                        "preview_name": "ANZAHL UNTERSUCHUNGEN"
                    },
                    "Gemeinde": {
                        "canonical_type": "text",
                        "database_type": "VARCHAR",
                        "preview_name": "GEMEINDE"
                    },
                    "coordinates": {
                        "canonical_type": "text",
                        "database_type": "VARCHAR",
                        "preview_name": "COORDINATES"
                    }
                }
            }
        }
    }
}
```

***

### <mark style="background-color:yellow;">list\_datasets</mark>

Returns a paginated list of the datasets within a solution that the user has access to.

#### Parameters

* `page` (required): The page number to return. `minimum: 1`
* `size` (required): The number of items to return per page. `minimum: 1, maximum: 100`
* `solution_id` (required): The ID of the solution to view all datasets from.

#### Example

```bash
curl -X POST 'https://app.polyteia.com/api' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer <your_access_token>" \
    --data '{
        "query": "list_datasets",
        "params": {
            "page": 1,
            "size": 10,
            "solution_id": "sol_cv33u4n0i6q45p93i930"
        }
    }'
```

#### Response

* `total`: The total number of datasets subject to the user's access rights.
* `page`: The current page number.
* `size`: The number of items requested per page.
* `items`: An array of datasets. See [Dataset Response](#dataset-response) for more details.

```json
{
  "data": {
    "total": 1,
    "page": 1,
    "size": 10,
    "items": [
      {
        "id": "ds_cv33u4n0i6q45p93i930",
        "organization_id": "org_cv4v6p51bmu5rjcjk720",
        "solution_id": "sol_cv33u4n0i6q45p93i930",
        "name": "Lorem ipsum dolor sit ame",
        "slug": "my_unique_slug"
      }
    ]
  }
}
```
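
To read every dataset in a solution, a client would typically request successive pages until `total` items are covered. A minimal sketch of the arithmetic, assuming `total` and `size` are taken from a `list_datasets` response; the function name `page_count` is hypothetical:

```bash
# Hypothetical helper: number of pages needed to read `total` items
# at `size` items per page (ceiling division with integer arithmetic).
page_count() {
    total=$1
    size=$2
    echo $(( (total + size - 1) / size ))
}
```

For example, `page_count 101 100` prints `2`, so a client would request `page` 1 and `page` 2 with `size` set to 100.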

***

## Commands

***

### <mark style="background-color:yellow;">create\_dataset</mark>

Use this command to create a new dataset.

#### Parameters

* `solution_id` (required): The ID of the solution to create the dataset in.
* `name` (required): Name of the dataset. `minimum: 1` `maximum: 50` `unicode characters only`
* `description` (optional): Description of the dataset. `maximum: 255` `unicode characters only`
* `source` (optional): The source of the data in the dataset. `maximum: 255` `unicode characters only`
* `documentation` (optional): A JSON object containing block type documentation for the dataset.
* `slug` (optional): A user-provided, URL-friendly identifier for the dataset. It must be unique within the solution where the dataset is created. If omitted, the system tries to generate the slug from the name. Passing it explicitly is recommended for better slug management. `maximum: 50` `allowed characters: a-z, 0-9, _`
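
The slug constraints listed above can be checked on the client before sending the request. This is a minimal sketch, not the server's validation logic; `is_valid_slug` is a hypothetical name, and rejecting an empty slug is an assumption (the parameter list only states a maximum length).

```bash
# Hypothetical client-side check for the documented slug rules:
# at most 50 characters, only a-z, 0-9 and underscore.
# Assumption: an explicitly passed slug must not be empty.
is_valid_slug() {
    slug=$1
    [ -n "$slug" ] || return 1
    [ "${#slug}" -le 50 ] || return 1
    case $slug in
        # Characters are spelled out so the check does not depend on locale ranges.
        *[!abcdefghijklmnopqrstuvwxyz0123456789_]*) return 1 ;;
    esac
    return 0
}
```

A call such as `is_valid_slug my_unique_slug` succeeds, while `is_valid_slug My-Slug` fails because of the uppercase letters and the hyphen.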

#### Example

```bash
curl -X POST 'https://app.polyteia.com/api' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer <your_access_token>" \
    --data '{
        "query": "create_dataset",
        "params": {
            "solution_id": "sol_cv33u4n0i6q45p93i930",
            "name": "Lorem ipsum dolor sit ame",
            "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
            "source": "RRR",
            "slug": "my_unique_slug"
        }
    }'
```

#### Response

The response is the [Dataset Response](#dataset-response) of the created dataset.

```json
{
    "data": {
        "id": "ds_cv33u4n0i6q45p93i930",
        "organization_id": "org_cv4v6p51bmu5rjcjk720",
        "solution_id": "sol_cv33u4n0i6q45p93i930",
        "name": "Lorem ipsum dolor sit ame",
        "slug": "my_unique_slug",
        "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
        "source": "RRR",
        "created_at": "2025-03-11T17:51:07.257829Z",
        "updated_at": "2025-03-11T18:30:18.866332Z"
    }
}
```

***

### <mark style="background-color:yellow;">update\_dataset</mark>

Use this command to update basic details of a dataset.

#### Parameters

* `id` (required): ID of the dataset to update.
* `name` (required): Name of the dataset. `minimum: 1` `maximum: 50` `unicode characters only`
* `description` (optional): Description of the dataset. `maximum: 255` `unicode characters only`
* `source` (optional): The source of the data in the dataset. `maximum: 255` `unicode characters only`
* `documentation` (optional): A JSON object containing block type documentation for the dataset.
* `slug` (optional): A user-provided, URL-friendly identifier for the dataset. It must be unique within the solution that contains the dataset. If not provided, the system tries to generate the slug from the name. Passing it explicitly is recommended for better slug management. `maximum: 50` `allowed characters: a-z, 0-9, _`

#### Example

```bash
curl -X POST 'https://app.polyteia.com/api' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer <your_access_token>" \
    --data '{
        "query": "update_dataset",
        "params": {
            "id": "ds_cv33u4n0i6q45p93i930",
            "name": "Lorem ipsum dolor sit ame",
            "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
            "source": "RRR",
            "slug": "my_unique_slug"
        }
    }'
```

#### Response

The response is the [Dataset Response](#dataset-response) of the updated dataset.

```json
{
    "data": {
        "id": "ds_cv33u4n0i6q45p93i930",
        "organization_id": "org_cv4v6p51bmu5rjcjk720",
        "solution_id": "sol_cv33u4n0i6q45p93i930",
        "name": "Lorem ipsum dolor sit ame",
        "slug": "my_unique_slug",
        "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
        "source": "RRR",
        "created_at": "2025-03-11T17:51:07.257829Z",
        "updated_at": "2025-03-11T18:30:18.866332Z"
    }
}
```

***

### <mark style="background-color:yellow;">delete\_dataset</mark>

Use this command to delete a dataset.

{% hint style="danger" %}
**This action is irreversible. Use this command with caution. When you delete a dataset, the data within the dataset is also irreversibly deleted.**
{% endhint %}

#### Parameters

* `id` (required): The ID of the dataset to delete.

#### Example

```bash
curl -X POST 'https://app.polyteia.com/api' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer <your_access_token>" \
    --data '{
        "query": "delete_dataset",
        "params": {
            "id": "ds_cv33u4n0i6q45p93i930"
        }
    }'
```

#### Response

```json
{
    "data": {
        "id": "ds_cv33u4n0i6q45p93i930"
    }
}
```

* `id`: The ID of the dataset that was deleted.


