venerdì 17 luglio 2020

Aggregation + Composite Model in Power BI

The composite model expects to have huge tables as a DirectQuery source and some smaller tables in import mode. When we talk about huge tables, performance is always a challenge. In this session, we will talk about how aggregations can be useful for speeding up performance. You can see how gigabytes of data can be analyzed and used in less than a second. Learn about many tips, tricks, and insights into aggregations in Power BI and how to use them in a composite model.


martedì 24 dicembre 2019

Change the dataset of a Power BI report using powershell

Import-Module PowerBIPS

$authToken = Get-PBIAuthToken
Set-PBIGroup -authToken $authToken -name "test" -Verbose
Set-PBIReportsDataset -authToken $authToken -sourceDatasetName "xxx" -targetDatasetName "xxx" -Verbose

mercoledì 13 novembre 2019

New AI Function in Power Query

Is that ai functions are now available in power query when you’re in the query editor you’ll able to see that there are three types of AI Transformations available :
Text analytics
vision analytics
azure machine learning models

Additionally for nay of these options you’ll be able to pick out which premium capacity you want to run it off of and this is because these features are exclusive to premium so while you’re in desktop you pick out which specific capacity you want a reference and then in power bi service if you published up your reports to a premium workspace will automatically run them in the premium capacity associated with that workspace.

giovedì 1 agosto 2019

Upload data to a Power BI real-time

This blog will show you how to load data into a Power BI real-time data set using Powershell.
The following code load to a Power BI dataset from a table in the adventureworks database

#Variables - details of the connection, stored procedure and parameters
$connectionString = "server=localhost;database='Adventureworksdw2017';trusted_connection=true;";
$storedProcedureCall = [your store procedure;

#SQL Connection - connection to SQL server
$sqlConnection = new-object System.Data.SqlClient.SqlConnection;
$sqlConnection.ConnectionString = $connectionString;

#SQL Command - set up the SQL call
$sqlCommand = New-Object System.Data.SqlClient.SqlCommand;
$sqlCommand.Connection = $sqlConnection;
$sqlCommand.CommandText = $storedProcedureCall;


#SQL Adapter - get the results using the SQL Command
$sqlAdapter = new-object System.Data.SqlClient.SqlDataAdapter
$sqlAdapter.SelectCommand = $sqlCommand
$dataSet = new-object System.Data.Dataset
$recordCount = $sqlAdapter.Fill($dataSet)

#Close SQL Connection
$sqlConnection.Close();

#Get single table from dataset
$data = $dataSet.Tables[0]

$endpoint = [your dataset url]

#Loop through each row of data and create a new file
#The dataset contains a column named FileName that I am using for the name of the file
foreach($row in $data)
{

$payload = @{
"ProductKey" =$row.Item("ProductKey")
"SalesAmount" =$row.Item("SalesAmount")
}
Invoke-RestMethod -Method Post -Uri "$endpoint" -Body (ConvertTo-Json @($payload))
write-Host $row.Item("ProductKey")
}


at this link the script

giovedì 18 luglio 2019

Wrangling Data Flow

Organizations need to do data preparation/wrangling for accurate analysis of data that is more complex and continues to grow every day. Data preparation is also required so that organizations can use the data effectively in various business processes and reduce the time to value.

Wrangling Data Flow in Azure Data Factory allows you to do code-free data preparation/wrangling @cloud scale iteratively. Wrangling Data Flow integrates with Power Query Online and makes the best in class Power Query M functions available for data wrangling @ cloud scale via spark execution.

Wrangling Data Flow translates M generated by Power Query Online Mashup Editor into Spark code for cloud scale execution and provides best in class monitoring experience.

martedì 2 aprile 2019

How the power query cache works?

Let's see how the power query cache works (in both Excel and Power BI Desktop). A "persistent cache" stored on disk, when refreshing queries. But the persistent cache does not store the results of your M Queries. Instead, it stores the results of the behind-the-scenes requests sent to data sources. The warning like "This preview may be up to .... old" happens because there is a persistent cache used for interactive previewing of query results. This cache is long-lived and is intended to make the experience of working in the editor faster.

So Query1 queries a SQL Server database and returns a single unfiltered table. In this case, the persistent cache will now know the result of sending. If another M query (whether through referencing Query1, or by querying the same table directly) needs the same result, the persistent cache can provide it, and the result won't have to be fetched a second time from the SQL Server

We will see that this does not work and this sometimes creates performance problems


Let's try to understand how the Power Query cache works. To do this we need to prepare the test environment.Now let's start by uploading data from the sql server and in detail from the contoso database using the following query t-sql

Insert the t-sql query in the dialog box of a direct query connection as shown in the image below
After entering the query we click on ok. When we import the data we enter the query modification mode to be able to handle the query just created. Rename the query with the name "test"
Next we create three references of the "test" query. The references we call them R1, R2, R3 as shown in the image below:
Now you can close the power query window with "close and apply". We have almost completed the preparation of the test environment. We need to go into the power bi settings and make sure the flag is set to "Enabled parallel loading to tables" as shown in the image below
Now let's keep the sql server profiler open and refresh it in power bI. As we see the power query engine executes 4 queries to SQL Server that are run in parallel.


I thought the Power Query engine first evaluates "Test", and after that happened the data goes from there to R1, R2, R3. Actually, what happens is the opposite: R1, R2 and R3 are evaluated in parallel independently because each of them refers to "test". Power Query persistent cache is updated through a background thread. And the separate assessments running at the same time are not coordinated; when the R1 evaluation accesses the persistent cache this does not prevent the R2 or R3 evaluation from accessing the cache. This means that even when using a shared persistent cache, PQ can potentially require the same data twice. It depends on the timing of the various requests.

This happens because we have enabled the "Enable parallel table loading" flag if instead we set the "Enable parallel loading of tables" on off,

so the queries above are evaluated in series, the SQL Profiler shows that the sql server is called a only once.

The Profiler shows that only one of the queries on R1, R2 and R3 is requested from the SQL Server while the other two are taken from the cache. In fact, even if the three queries cause the "test" evaluation, every time "test" is evaluated, it is possible to reuse the data stored in the persistent cache.

On the contrary, with the three queries updated in parallel, each "test" evaluation takes place in the cache. This is slower, but in this case not much slower. In other cases, the difference between parallel and sequential query execution could be much wider.

to understand how the cache works, I used a beautiful chris webb article. About this feature in detail here https://blog.crossjoin.co.uk/2019/03/26/power-bi-caching-parallelism-and-power-query-refresh-performance/

Aggregation + Composite Model in Power BI

The composite model expects to have huge tables as a DirectQuery source and some smaller tables in import mode. When we talk about huge tabl...