Friday, March 23, 2012

How to determine Count of ties in MDX?

I need to determine the number of values in a set that are non-unique in the set.

In SQL, I would do the following:

select sum(value_count) as Ties

from (select value, count(*) as value_count

from source

group by value

having count(*) >1

)

In MDX, I have a set defined in a query that is passed into several custom statistical functions. I would like to be able to accomplish this without writing another external function, as it seems the sort of thing that should be easy...

Here is the query I'm trying to plug this into. Suggestions welcome.

with

set [data] as ( nonempty({

[Sample Date].[Time].[Day] }) *{ [Measures].[Parm Value] } )

MEMBER [Sen's Slope] as 'Statistics.Sen_Slope({[data]})'

MEMBER [Kendall's Tau] as 'Statistics.KENDALLS_TAU({[data]})'

MEMBER [Significant] as 'Statistics.Significance({[data]})'

MEMBER [MK_Z] as 'Statistics.Mann_Kendall_Z({[data]})'

member [measures].[Samples] as [data].count

member [measures].[Average] as avg([data])

member [measures].[Median] as Median([data])

member [measures].[Last Sample] as tail([data],1).item(0)

MEMBER [Ties] as '0'

select {[Samples],[Average],[Median],[Last Sample], [Kendall's Tau], [Sen's Slope], [MK_Z], [Significant], [Ties] } on 0

FROM ( SELECT ( [Sample Date].[Time].[2002 Q4]:[Sample Date].[Time].[2006 Q3] ) on 0,

{([Site Hierarchy].[Site Hierarchy].[Site ID].[0829SD6001] ,

[Analysis Parameter].[Analysis Parameter].[Acidity (ACD)])} ON 1

FROM [Sample Data])

;

Hi Clayton,

This sample from Adventure Works, where value is [Reseller Order Quantity], seems to work:

SQL query:

>>

select sum(order_count) as Ties
from
(select OrderQuantity, count(*) as order_count
from
(select ProductKey, sum(OrderQuantity) as OrderQuantity
from dbo.FactResellerSales
group by ProductKey) ps
group by OrderQuantity
having count(*) > 1) oc

87

>>

MDX query:

>>

With

Member [Measures].[PrdTies] as

Count(Filter(Order(NonEmpty(

[Product].[Product].[Product].Members,

{[Measures].[Reseller Order Quantity]}),

[Measures].[Reseller Order Quantity], BDESC) as OrdPrds,

(OrdPrds.CurrentOrdinal < OrdPrds.Count

And [Measures].[Reseller Order Quantity] =

([Measures].[Reseller Order Quantity],

OrdPrds.Item(OrdPrds.CurrentOrdinal)))

OR (OrdPrds.CurrentOrdinal > 1

And [Measures].[Reseller Order Quantity] =

([Measures].[Reseller Order Quantity],

OrdPrds.Item(OrdPrds.CurrentOrdinal-2)))))

select {[Measures].[PrdTies]} on 0

from [Adventure Works]

PrdTies
87

>>

|||

Deepak,

Thanks for the thourough response. I've done some testing, comparing results from the MDX you wrote to the ADOMD CLR proc I wrote, and I was getting different results. I've traced through the CLR version and manually counted the values in the set, and the CLR proc was correct. I've clipped out the relevant code in case you want to compare results. It may be the differences in how I'm calling the CLR functions by passing in a set versus the Measure method, which is working over the complete set of data from the subcube query.

Here is your test MDX query, with a measure added for the CLR function. In this case, both methods return the same count, as expected. I'm just curious as to what might be causing me to get dfferent counts between the two methods when used in my cube...

With

set [data] as ( NonEmpty([Product].[Product].[Product].Members)

*([Measures].[Reseller Order Quantity]) )

member [CountOfTies] as 'Statistics.CountOfTies({[data]})'

Member [Measures].[PrdTies] as

Count(Filter(Order(NonEmpty(

[Product].[Product].[Product].Members,

{[Measures].[Reseller Order Quantity]}),

[Measures].[Reseller Order Quantity], BDESC) as OrdPrds,

(OrdPrds.CurrentOrdinal < OrdPrds.Count

And [Measures].[Reseller Order Quantity] =

([Measures].[Reseller Order Quantity],

OrdPrds.Item(OrdPrds.CurrentOrdinal)))

OR (OrdPrds.CurrentOrdinal > 1

And [Measures].[Reseller Order Quantity] =

([Measures].[Reseller Order Quantity],

OrdPrds.Item(OrdPrds.CurrentOrdinal-2)))))

select {[Measures].[PrdTies], [CountOfTies]} on 0

from [Adventure Works];

Assembly code:

using System;

using System.Collections;

using Microsoft.AnalysisServices;

using Microsoft.AnalysisServices.AdomdServer;

namespace Statistics

{

public sealed class Statistics

{

private Statistics()

{

}

[CLSCompliant(false)]

// function to return a sum of the counts for tied values

public static Object CountOfTies(Set data1)

{

try

{

//number of Tuples in the set

Int32 v = data1.Tuples.Count;

if (v > 1)

{

//dimension a new array to hold values for passing into GetCountOfTies()

Double[] ValueArray;

ValueArray = new Double[v];

MDXValue mdxval;

Int32 i = 0;

// convert tuple set into an array

foreach (Tuple t in data1.Tuples)

{

mdxval = t;

ValueArray[i++] = mdxval.ToDouble();

}

// get array of any ties

Int32[] ResultArray = GetCountOfTies(ValueArray);

//sum counts to get results to return

i = 0;

foreach (Int32 j in ResultArray)

{i = i + j;}

return i;

}

else {return 0;}

}

catch (Exception e)

{

Console.WriteLine(e);

return "ERROR";

}

}

private static Int32[] GetCountOfTies(double[] doubleArray)

{

//create an array of counts of unique values, where there are ties in values.

//the resulting array will contain one element for each value with ties.

Array.Sort(doubleArray);

ArrayList numbers = new ArrayList();

double? lastVal = null;

int currentCount = 0;

foreach (double val in doubleArray)

{

if (lastVal != null && lastVal != val)

{

if (currentCount > 1)

numbers.Add(currentCount);

currentCount = 1;

}

else

{currentCount++;}

lastVal = val;

}

return (Int32[])numbers.ToArray(typeof(Int32));

}

}

}

No comments:

Post a Comment