Let's see how the power query cache works (in both Excel and Power BI Desktop). A "persistent cache" stored on disk, when refreshing queries. But the persistent cache does not store the results of your M Queries. Instead, it stores the results of the behind-the-scenes requests sent to data sources. The warning like "This preview may be up to .... old" happens because there is a persistent cache used for interactive previewing of query results. This cache is long-lived and is intended to make the experience of working in the editor faster.
So Query1 queries a SQL Server database and returns a single unfiltered table. In this case, the persistent cache will now know the result of sending. If another M query (whether through referencing Query1, or by querying the same table directly) needs the same result, the persistent cache can provide it, and the result won't have to be fetched a second time from the SQL Server
We will see that this does not work and this sometimes creates performance problems
Let's try to understand how the Power Query cache works. To do this we need to prepare the test environment.Now let's start by uploading data from the sql server and in detail from the contoso database using the following query t-sql
Insert the t-sql query in the dialog box of a direct query connection as shown in the image below
I thought the Power Query engine first evaluates "Test", and after that happened the data goes from there to R1, R2, R3. Actually, what happens is the opposite: R1, R2 and R3 are evaluated in parallel independently because each of them refers to "test". Power Query persistent cache is updated through a background thread. And the separate assessments running at the same time are not coordinated; when the R1 evaluation accesses the persistent cache this does not prevent the R2 or R3 evaluation from accessing the cache. This means that even when using a shared persistent cache, PQ can potentially require the same data twice. It depends on the timing of the various requests.
This happens because we have enabled the "Enable parallel table loading" flag if instead we set the "Enable parallel loading of tables" on off,
so the queries above are evaluated in series, the SQL Profiler shows that the sql server is called a only once.
The Profiler shows that only one of the queries on R1, R2 and R3 is requested from the SQL Server while the other two are taken from the cache. In fact, even if the three queries cause the "test" evaluation, every time "test" is evaluated, it is possible to reuse the data stored in the persistent cache.
On the contrary, with the three queries updated in parallel, each "test" evaluation takes place in the cache. This is slower, but in this case not much slower. In other cases, the difference between parallel and sequential query execution could be much wider.
to understand how the cache works, I used a beautiful chris webb article. About this feature in detail here https://blog.crossjoin.co.uk/2019/03/26/power-bi-caching-parallelism-and-power-query-refresh-performance/
As we start the new year, we’re excited to announce we’re rolling out report sharing to users worldwide. Now reports and dashboards have the...
Let's see how the power query cache works (in both Excel and Power BI Desktop). A "persistent cache" stored on disk, when refr...
The composite model expects to have huge tables as a DirectQuery source and some smaller tables in import mode. When we talk about huge tabl...