How to copy data using copy activity - Microsoft Fabric

In a data pipeline, you can use the Copy activity to copy data between data stores in the cloud.

Important

Microsoft Fabric is currently in PREVIEW. This information relates to a pre-release product, which is subject to significant changes prior to release. Microsoft makes no warranties, express or implied, with respect to the information provided here. Refer to the Azure Data Factory documentation for the service in Azure.

After you copy the data, you can use other activities to further transform and analyze it. You can also use the Copy activity to publish transformation and analysis results for business intelligence (BI) and application consumption.

To copy data from a source to a destination, the service that performs the copy activity performs the following steps, illustrated by the sketch after this list:

  1. Reads data from a source data store.
  2. Performs serialization/deserialization, compression/decompression, column mapping, and so on, based on the configuration.
  3. Writes data to the destination data store.
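
The following is a minimal sketch of those three steps, assuming a gzipped CSV source; it illustrates the flow only and is not the Fabric copy service's actual implementation. The copy_blob function and the column_mapping parameter are hypothetical names introduced for this example.

```python
import csv
import gzip
import io

# Illustrative only: copy_blob and column_mapping are hypothetical names,
# not a Fabric API. Assumes a gzipped CSV source with at least one row.
def copy_blob(source_bytes: bytes, column_mapping: dict[str, str]) -> bytes:
    # 1. Read from the source store: decompress and deserialize.
    with gzip.open(io.BytesIO(source_bytes), mode="rt", newline="") as f:
        rows = list(csv.DictReader(f))

    # 2. Apply the configured column mapping (source name -> destination name).
    mapped = [{column_mapping.get(k, k): v for k, v in row.items()}
              for row in rows]

    # 3. Serialize, compress, and hand the bytes to the destination store.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(mapped[0].keys()))
    writer.writeheader()
    writer.writerows(mapped)
    return gzip.compress(out.getvalue().encode("utf-8"))
```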

Prerequisites

To get started, you must meet the following requirements:

  • A Microsoft Fabric tenant account with an active subscription. Create an account for free.

  • Make sure you have a Microsoft Fabric-enabled workspace.

Add a copy activity using the copy assistant

Follow these steps to set up your copy activity with the copy assistant.

Start with the copy assistant

  1. Open an existing data pipeline or create a new data pipeline.

  2. Select Copy data on the canvas to open the copy assistant tool and get started. Or select Use copy assistant from the Copy data drop-down list under the Activities tab in the ribbon.

Configure your source

  1. Select a data source type from the category. For this example, use Azure Blob Storage. Select Azure Blob Storage, and then select Next.

  2. Connect to your data source by selecting Create new connection.

    After you select Create new connection, enter the required connection information, and then select Next. For details on how to connect for each data source type, see the relevant connector article.

    If you have existing connections, you can select Existing connection and choose your connection from the drop-down list.

  3. In this source configuration step, select the file or folder to copy, and then selectNext.

Configure your destination

  1. Select a data destination type from the category. For this example, use Azure Blob Storage. Select Azure Blob Storage, and then select Next.

  2. You can either create a new connection linking to a new Azure Blob Storage account by following the steps in the previous section, or use an existing connection from the connection drop-down list. Test connection and Edit options are available for each selected connection.

  3. Configure and map your source data to the destination. Then select Next to finish your destination configuration.

Review and create your copy activity

  1. Review your copy activity settings from the previous steps, and select OK to finish. Alternatively, you can go back to the previous steps to edit your settings in the tool if needed.

When you're done, the copy activity is added to your data pipeline canvas. All settings, including advanced settings for this copy activity, are available under its tabs when it's selected.

Now you can either save your data pipeline with this single copy activity or continue designing your data pipeline.

Add a copy activity directly

Follow these steps to add a copy activity directly.

Add a copy activity

  1. Open an existing data pipeline or create a new data pipeline.

  2. Add a copy activity either by selecting Add pipeline activity > Copy activity, or by selecting Copy data > Add to canvas under the Activities tab.

Configure your general settings on the General tab

For information about configuring your general settings, see General settings.

Configure your source on the Source tab

  1. Select + New next to Connection to connect to your data source.

    1. In the pop-up window, select the data source type. For this example, use Azure SQL Database. Select Azure SQL Database, and then select Continue.

    2. The connection creation page opens. Enter the required connection information in the panel, and then select Create. For details on how to connect for each data source type, see the relevant connector article.

    3. Once your connection is created successfully, you're taken back to the data pipeline page. Select Refresh to load the connection you created from the drop-down list. You can also choose an existing Azure SQL Database connection directly from the drop-down if you created one before. Test connection and Edit options are available for each selected connection. Then select Azure SQL Database in Connection type.

  2. Specify a table to copy. Select Preview data to view your source table. You can also read data from your source with Query and Stored procedure; see the example after this list.

  3. Expand Advanced for more advanced settings.
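
For illustration, the following is the kind of statement you might enter in the Query option instead of copying a whole table. The table and column names are invented for this example, and the query is shown as a plain Python string only for presentation:

```python
# Hypothetical example of a source query; dbo.SalesOrders and its columns
# are invented names, not part of any real sample database.
source_query = """
SELECT CustomerId, CustomerName, OrderDate
FROM dbo.SalesOrders
WHERE OrderDate >= '2023-01-01'
"""
```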

Configure your destination on the Destination tab

  1. Choose your destination type. It can be either an internal first-class data store in your workspace, such as Lakehouse, or an external data store. This example uses Lakehouse.

  2. Choose Lakehouse in Workspace data store type. Select + New, and the Lakehouse creation page opens. Specify your Lakehouse name, and then select Create.

  3. Once your connection is created successfully, you're taken back to the data pipeline page. Select Refresh to load the connection you created from the drop-down list. You can also choose an existing Lakehouse connection directly from the drop-down if you created one before.

  4. Specify a table, or set a file path to define a file or folder, as the destination. Here, select Tables and specify a table to write the data to.

  5. Expand Advanced for more advanced settings.

Now you can either save your data pipeline with this single copy activity or continue designing your data pipeline.

Configure your mappings on the Mapping tab

If the connector you use supports mapping, go to the Mapping tab to configure your mapping.

  1. Select Import schemas to import your data schema.

  2. You can see the automatic mapping appear. Specify your Source column and Destination column. If you're creating a new table in the destination, you can customize the Destination column names here. If you're writing data into an existing destination table, you can't modify the existing Destination column names. You can also view the Type of the source and destination columns.

In addition, you can select + New mapping to add a new mapping, select Clear to clear all mapping settings, and select Reset to default to reset all mapping Source columns.
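
As a rough illustration, the mapping produced by Import schemas can be thought of as a list of source-to-destination column pairs with types. The structure below is hypothetical, for intuition only, and is not Fabric's actual mapping format:

```python
# Hypothetical shape of an auto mapping, for intuition only; column names
# and type labels are invented, and this is not Fabric's real format.
auto_mapping = [
    {"source": {"name": "Id", "type": "Int32"},
     "destination": {"name": "Id", "type": "Int32"}},
    {"source": {"name": "CustomerName", "type": "String"},
     "destination": {"name": "CustomerName", "type": "String"}},
]

# When writing to a new destination table, destination names can be edited:
custom_mapping = [dict(m) for m in auto_mapping]
custom_mapping[1]["destination"] = {"name": "customer_name", "type": "String"}

# "Reset to default" discards the edits and restores the auto mapping.
mapping = auto_mapping
```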

Configure your type conversion

Expand Type conversion settings to configure your type conversion, if needed.

For details, each setting is described below.

Allow data truncation: Allow data truncation when converting source data to destination data of a different type during copy; for example, from decimal to integer, or from DatetimeOffset to Datetime.

Treat boolean as number: Treat a boolean as a number; for example, treat true as 1.

DateTime format: Format string when converting between dates without a time zone offset and strings; for example, "yyyy-MM-dd HH:mm:ss.fff".

DateTimeOffset format: Format string when converting between dates with a time zone offset and strings; for example, "yyyy-MM-dd HH:mm:ss.fff zzz".

TimeSpan format: Format string when converting between time spans and strings; for example, "dd.hh:mm:ss".

Culture: Culture information to use when converting types; for example, "en-us" or "fr-fr".
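
The sketch below shows what these conversions do, using Python strftime patterns as analogues of the .NET-style format strings above. The convert function and its parameter names are hypothetical, invented for this illustration, and are not a Fabric API:

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical helper mirroring the settings above; not a Fabric API.
# "%Y-%m-%d %H:%M:%S" is the Python analogue of "yyyy-MM-dd HH:mm:ss".
def convert(value, allow_data_truncation=True, treat_boolean_as_number=True,
            datetime_format="%Y-%m-%d %H:%M:%S"):
    if isinstance(value, bool) and treat_boolean_as_number:
        return 1 if value else 0                 # Treat boolean as number: true -> 1
    if isinstance(value, Decimal):
        if not allow_data_truncation and value != value.to_integral_value():
            raise ValueError(f"truncating {value} is not allowed")
        return int(value)                        # Allow data truncation: 3.7 -> 3
    if isinstance(value, datetime):
        return value.strftime(datetime_format)   # DateTime format: date -> string
    return value

print(convert(Decimal("3.7")))                   # 3
print(convert(True))                             # 1
print(convert(datetime(2023, 2, 4, 12, 30, 5)))  # 2023-02-04 12:30:05
```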

Configure your other settings on the Settings tab

The Settings tab contains the settings for performance, staging, and so on.

Each setting is described below.

Intelligent throughput optimization: Specify to optimize the throughput. You can choose between Auto, Standard, Balanced, and Maximum. If you choose Auto, the optimal setting is dynamically applied based on your source-destination pair and data pattern. You can also customize the throughput; the custom value can range from 2 to 256, and a higher value implies more gains.

Degree of copy parallelism: Specify the degree of parallelism that the data loading should use.

Fault tolerance: When this option is selected, you can ignore some errors that occur during the copy process; for example, incompatible rows between the source and destination store, a file being deleted during data movement, and so on.

Enable logging: When this option is selected, you can log copied files, skipped files, and skipped rows.

Enable staging: Specify whether to copy data via an interim staging store. Enable staging only for the beneficial scenarios.

Staging account connection: When you select Enable staging, specify an Azure Storage data source connection as the interim staging store. Select + New to create a staging connection if you don't have one.
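
As a rough sketch of how fault tolerance and logging work together, the loop below skips rows the destination rejects and records them instead of failing the whole copy. The function, row shapes, and log format are all hypothetical, invented for this illustration:

```python
# Hypothetical sketch: skip incompatible rows and log them rather than
# failing the whole copy. Row shapes and the log format are invented.
def copy_with_fault_tolerance(rows, write_row, skipped_log):
    copied = skipped = 0
    for row in rows:
        try:
            write_row(row)        # may raise on an incompatible row
            copied += 1
        except (TypeError, ValueError) as err:
            skipped_log.append({"row": row, "error": str(err)})
            skipped += 1
    return copied, skipped

log: list[dict] = []
rows = [{"id": 1, "qty": "5"}, {"id": 2, "qty": "n/a"}]
copied, skipped = copy_with_fault_tolerance(
    rows, lambda r: int(r["qty"]), log)  # destination expects integer qty
print(copied, skipped)  # 1 1, with the incompatible row recorded in log
```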

Next steps

  • Connector overview
  • How to monitor pipeline runs