Using -1 for Unknown Dimension member with Netezza Can Cause Slowness
A typical dimensional design can cause problems on the IBM PureData for Analytics (Netezza) server. This extremely fast data warehouse appliance can slow to a crawl when a table is distributed on a key that uses a single row for unknown. This happens when a large amount of data is associated with that row, skewing the data to a single data slice.
A common practice in dimensional modeling is to use a surrogate primary key made up of sequential, meaningless numbers. A business key in the dimension is used to look up the surrogate key when loading facts. To handle the case where the business key from the fact table is missing or not found in the dimension, a typical design adds an entry with a key such as -1 that will never collide with the normal numbers.
A table in Netezza is distributed across a number of data slices and when a fact and dimension are joined in a query, their related rows must be physically together on the same slice before parallel processing. If they are not, the rows are redistributed or broadcast so that they are. When there are two very large tables to join, performance can be increased by setting the distribution on each table to the columns they share in the relationship.
If this is done when a large amount of data is associated with the unknown row, that data will be skewed to a single data slice. Since your query is only as fast as the slowest data slice, this will be the bottleneck.
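To make the skew concrete, here is a minimal Python sketch that simulates distributing fact rows across data slices by their dimension key. The hash function (`zlib.crc32`), the slice count of 8, and the row counts are all assumptions chosen for illustration; Netezza's actual distribution hash is internal and will place rows differently.

```python
import zlib
from collections import Counter


def slice_for(key: int, n_slices: int) -> int:
    # Stand-in hash for illustration only; not Netezza's real algorithm.
    return zlib.crc32(str(key).encode()) % n_slices


def rows_per_slice(keys, n_slices):
    # Count how many rows land on each simulated data slice.
    counts = Counter(slice_for(k, n_slices) for k in keys)
    return [counts.get(s, 0) for s in range(n_slices)]


def skew_ratio(counts):
    # Largest slice divided by the average slice; 1.0 means perfectly even.
    return max(counts) / (sum(counts) / len(counts))


# Hypothetical fact table: 20,000 rows with known keys,
# plus 80,000 rows that all point at the single -1 "unknown" member.
fact_keys = list(range(1, 20001)) + [-1] * 80000
counts = rows_per_slice(fact_keys, 8)
print(counts)
print(round(skew_ratio(counts), 2))
```

Whichever slice the -1 key hashes to ends up holding at least 80,000 of the 100,000 rows, so that one slice does most of the work while the others sit idle.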
A solution to this problem can be found here. (I never thought I would advocate adding 10,000 dummy records.) When applying this approach to a large dimension table, try to generate negative numbers that span approximately the same range as the positive numbers. Otherwise, they could all still skew to the same data slice.
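One way to picture the dummy-record idea is the Python sketch below. It is not the linked solution itself, only an illustration under assumptions: the function name `dummy_unknown_keys`, the maximum key of 5,000,000, the round-robin assignment, and the stand-in `zlib.crc32` hash are all hypothetical.

```python
import zlib
from collections import Counter
from itertools import cycle


def slice_for(key: int, n_slices: int) -> int:
    # Stand-in hash for illustration only; not Netezza's real algorithm.
    return zlib.crc32(str(key).encode()) % n_slices


def dummy_unknown_keys(max_positive_key: int, how_many: int):
    # Spread the negative keys evenly from -1 down to roughly -max_positive_key,
    # so they cover about the same range as the real surrogate keys.
    step = max(1, max_positive_key // how_many)
    return [-(1 + i * step) for i in range(how_many)]


unknown_keys = dummy_unknown_keys(max_positive_key=5_000_000, how_many=10_000)

# Assign 80,000 "unknown" fact rows to the dummy keys in round-robin order.
assigned = [key for key, _ in zip(cycle(unknown_keys), range(80_000))]
counts = Counter(slice_for(k, 8) for k in assigned)
print(sorted(counts.items()))
```

In a real load, the assignment would more likely happen in the ETL lookup step, for example by picking a dummy key from some attribute of the fact row, and every dummy row in the dimension would carry the same "Unknown" description.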
I’ve used technical books from Packt Publishing for years and they are always a great asset. I’ve also served as technical editor for the book “IBM Cognos BI v10.2 Administration Essentials”.
The following is a quote of their special offer:
Packt Publishing celebrates their 2000th title with an exclusive offer – We’ve got IT covered!
Known for their extensive range of pragmatic IT ebooks, Packt Publishing are celebrating their 2000th book title ‘Learning Dart’ – they want their customers to celebrate too.
To mark this milestone Packt Publishing will launch a ‘Buy One Get One Free’ offer across all eBooks on March 18th – for a limited period only.
David Maclean, Managing Director explains ‘It’s not by chance that this book is our 2000th title. Our customers and community drive demand and it is our job to ensure that whatever they’re working on, Packt provides practical help and support.
At Packt we understand that sometimes our customers want to learn a new programming language pretty much from scratch, with little knowledge of similar language concepts. Other times our customers know a related language fairly well and therefore want a fast-paced primer that brings them up to a competent professional level quickly.
That’s what makes Packt different: all our books are specifically commissioned by category experts, based on intensive research of the technology and the key tasks.’
Since 2004, Packt Publishing has been providing practical IT-related information that enables everyone to learn and develop their IT knowledge, from novice to expert.
Packt is one of the most prolific and fast-growing tech book publishers in the world. Originally focused on open source software, Packt contributes back into the community paying a royalty on relevant books directly to open source projects. These projects have received over $400,000 as part of Packt’s Open Source Royalty Scheme to date.
Their books focus on practicality, recognising that readers are ultimately concerned with getting the job done. Packt’s digitally-focused business model allows them to quickly publish up-to-date books in very specific areas across a range of key categories – web development, game development, big data, application development, and more. Their commitment to providing a comprehensive range of titles has seen Packt publish 1054% more titles in 2013 than in 2006.
Erol Staveley, Publisher, says ‘Recent research shows that 88% of our customers are very satisfied with the service knowing that we offer a wide breadth of titles in a timely manner, and owing to the quality of service that they receive 94% of customers are willing to recommend Packt to friends and family. It’s great that we’ve hit such a significant milestone, and we want to continue delivering this fantastic content to our customers.’
Here are some of the best titles across Packt’s main categories – but Buy One, Get One Free will apply across all 2000 titles:
Cannot use Open with Explorer in a SharePoint library – solved. (“Your client does not support opening this list with Windows Explorer”)
When browsing a SharePoint document library, there is an option on the ribbon bar to Open with Explorer. This is very useful because it lets you copy and paste multiple files into SharePoint or move them from one folder to another. However, clicking this button may produce the message “Your client does not support opening this list with Windows Explorer”.
The solution is to add a registry entry. The root problem is that in this situation, your credentials are not passed when the target URL has a dot in the address. The following procedure, from this Microsoft article, describes how to fix it:
- Click Start, type regedit in the Start Search box, and then press ENTER.
- Locate and then click the following registry subkey:
- On the Edit menu, point to New, and then click Multi-String Value.
- Type AuthForwardServerList, and then press ENTER.
- On the Edit menu, click Modify.
- In the Value data box, type the URL of the server that hosts the Web share, and then click OK.
Note: You can also type a list of URLs in the Value data box. For more information, see the “Sample URL list” section in this article.
- Exit Registry Editor.
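If the same change has to be made on several machines, the steps above could be captured in a .reg file instead of being typed by hand. The Python sketch below only builds the text of such a file. The subkey is deliberately left as a placeholder to be taken from the Microsoft article, and the server URL is a made-up example; the function names are hypothetical.

```python
def multi_sz_hex(values):
    # A Multi-String (REG_MULTI_SZ) value is stored as UTF-16LE strings,
    # each ending in a null character, with one extra null at the very end.
    raw = b"".join(v.encode("utf-16-le") + b"\x00\x00" for v in values)
    raw += b"\x00\x00"
    return ",".join(f"{byte:02x}" for byte in raw)


def reg_file_text(subkey, servers):
    # Build the contents of a .reg file that sets AuthForwardServerList.
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{subkey}]",
        f'"AuthForwardServerList"=hex(7):{multi_sz_hex(servers)}',
        "",
    ]
    return "\r\n".join(lines)


# Placeholder: substitute the subkey named in the Microsoft article.
SUBKEY = "<registry subkey from the Microsoft article>"
# Hypothetical server URL; use the server that hosts your Web share.
reg_text = reg_file_text(SUBKEY, ["https://sharepoint.example.com"])
print(reg_text)
```

Review the generated text before importing it, and back up the registry first, as with any manual registry edit.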
At the time of this writing, these two ebooks introducing the 2012 version of SQL Server and Windows are free Kindle books on Amazon:
IBM Cognos Statistics error RSV-CCP-0005 OXML not found
The installation of Cognos Statistics may go smoothly, but when you try to run a report, such as one of the included sample reports, you get the error RSV-CCP-0005 ‘OXML not found’.
OXML is the output file from SPSS, the statistics software acquired by IBM and integrated into a product that adds Statistics processing and graphing on to IBM Cognos Business Intelligence. It turns out that the root cause of this error is that IBM Cognos Statistics cannot be installed in a directory with a space in the directory path. See this IBM note: Space in the install path causes RSV-CCP-0005 error when opening or running Statistics reports.
This is unfortunate, because the overwhelming majority of the time people install Cognos BI under ‘Program Files’ or ‘Program Files (x86)’, and Cognos Statistics would naturally be installed in the same directory.
Install Cognos Statistics in a directory with no spaces in the path, say, “c:\IBMCognosStatistics”. There will now be a separate service, and it needs a different port number than the IBM Cognos BI Server. When you run Cognos Configuration for Cognos Statistics (assuming it’s on the same server), use the same URLs, except change the port number. For example, instead of 9300, use 9310. Do this for the dispatchers, the logging port and the shutdown port. Leave only the Content Manager port the same as for Cognos BI, because Cognos Statistics will communicate with the Cognos BI Server through the same Content Store.
Be sure Cognos BI is up and running before saving this configuration because it will need the Content Manager service for encryption and for registering the new service. When you save and start, there will be a new service under Windows Services for Statistics. Ideally, you will start the IBM Cognos service before the IBM Cognos Statistics service and, conversely, shut down the Statistics service before shutting down the Cognos BI service.
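As a rough illustration of the two rules above (no spaces in the install path, same URLs with a different port), here is a small Python sketch. The setting names and URI values in the dictionary are illustrative assumptions, not an export from Cognos Configuration; check them against your own installation.

```python
from urllib.parse import urlsplit, urlunsplit


def path_has_space(path: str) -> bool:
    # Cognos Statistics fails with RSV-CCP-0005 if the install path has a space.
    return " " in path


def with_port(uri: str, new_port: int) -> str:
    # Return the same URI with only the port number replaced.
    parts = urlsplit(uri)
    netloc = f"{parts.hostname}:{new_port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Illustrative Cognos BI values (assumed, not read from a real configuration).
bi_dispatchers = {
    "External dispatcher URI": "http://localhost:9300/p2pd",
    "Internal dispatcher URI": "http://localhost:9300/p2pd",
}
content_manager_uri = "http://localhost:9300/p2pd/servlet"

# Cognos Statistics: dispatchers move to 9310, Content Manager stays unchanged.
stats_dispatchers = {name: with_port(uri, 9310) for name, uri in bi_dispatchers.items()}

print(path_has_space(r"C:\Program Files\ibm\cognos"))
print(path_has_space(r"c:\IBMCognosStatistics"))
print(stats_dispatchers)
print(content_manager_uri)
```

The logging and shutdown ports are plain port numbers rather than URIs, so they are simply changed to values not already used by the Cognos BI service.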
Anecdotally, this error began with version 10.1, but I haven’t confirmed this. It seems the kind of problem that should be fixed in future releases. Maintaining a separate install outside of the standard area is undesirable.
In a greenfield situation where you are starting from an empty server, you might consider installing Cognos BI and Cognos Statistics in the same directory with no spaces in the path. In this case, there will be only one configuration and you won’t need separate ports or a separate service.