RAVINDRANATHREDDY - MSBI - BLOG: September 2009

Wednesday, September 30, 2009

http://blogs.msdn.com/bi/
http://blogs.msdn.com/msbi/default.aspx

http://www.tech-faq.com/
(it is a good site for all basic concepts)

http://sqlblogcasts.com/blogs/martinisti/default.aspx
(70-446)

MSBI-Data Mining

What is data Mining???
Data mining is usually defined as searching, analyzing and sifting through large amounts of data to find relationships, patterns, or any significant statistical correlations. With the advent of computers, large databases and the internet, it is easier than ever to collect millions, billions and even trillions of pieces of data that can then be systematically analyzed to help look for relationships and to seek solutions to difficult problems. Besides governmental uses, many marketers use data mining to find strong consumer patterns and relationships. Large organizations and educational institutions also data mine to find significant correlations that can enhance our society.
Purpose of Data Mining?
Data mining uses a relatively large amount of computing power operating on a large set of data to determine regularities and connections between data points. Algorithms that employ techniques from statistics, machine learning and pattern recognition are used to search large databases automatically. Data mining is also known as Knowledge-Discovery in Databases (KDD).
You can see video about Data Mining from the below link.

http://blog.sqlauthority.com/2009/09/03/sql-server-what-is-data-mining-a-simple-introductory-note/

SSIS Video’s-Some IMP Links

Hi,
I captured some videos from the below blog.
Enjoy;-)

http://blogs.msdn.com/mattm/
blog.
Its really good one.
Please don’t miss it.

SQL PASS 2009
I was just looking at the sessions for the SQL PASS 2009 Summit and was happy to see so many SSIS presentations on the list!
Andy Leonard
Applied SSIS Design Patterns
Steve Simon
SQL Server Integration Services and the modern financial institution
Davide Mauri
Instrumenting, Monitoring and Auditing of SSIS ETL Solutions
Brian Knight
Loading a Data Warehouse with SSIS
Dave Fackler
ETL from the Trenches: Using SSIS in the Real World
Joy Mundy
ETL: The Linchpin for the Complete Data Warehouse
Erik Veerman
Overcoming SSIS Deployment and Configuration Challenges Data Profiling and Cleansing with Integration Services 2008

SSWUG.ORG Ultimate Virtual Conference – April 2009
Samples for the CDC & Merge demo
Avoiding Common Pitfalls in SSIS

Matt Masson - Incremental Data Warehouse Loads with Merge & CDC
John Welch - Doing More (ETL) With Less (Effort) by Automating SSIS
Brian Knight - Loading a Data Warehouse With SSIS
Anthony D’Angelo - Deep Dive: Extending SSIS with .NET Development

SSIS - Tutorial: SSIS Performance Videos

These all are useful links about SSIS-Videos.
This post is coming a little late (these have already been announced here and here), but in case you haven’t seen these before, I wanted to bring attention to four new performance related created by the SQLCAT and SSIS teams. They mostly focus on larger scale/Enterprise level package deployments, but there are good tips for all levels of SSIS usage.
From the SQLCAT post:
Measuring and Understanding the Performance of Your SSIS Packages in the Enterprise (SQL Server Video) Link Watch this video
Author: Denny Lee, Microsoft Corporation
This video demonstrates how to measure and understand the performance of packages, based on lessons learned from enterprise customers. In this video, you will learn the following guidelines for improving performance:
How the limits of the source system affect performance.
Why disk I/O is important.
Why you should establish a package performance baseline
Tuning Your SSIS Package Data Flow in the Enterprise (SQL Server Video) Link Watch this video
Author: David Noor, Microsoft Corporation
This video demonstrates how to improve the performance of the data flow in an Integration Services package. In this video, you will learn how to tune the following phases of the data flow:
Extraction
Transformation
Loading
You can apply these performance tuning tips when you design, develop, and run the data flow.
Understanding SSIS Data Flow Buffers (SQL Server Video) Link Watch this video
Author: Bob Bojanic, Microsoft Corporation
This video looks at the memory buffers that transfer data to the data flow of an Integration Services package. The video includes the following items:
Demonstration of a simple package that shows you how Integration Services divides the data along the data flow paths and how buffers carry data along those paths. This simple package performs the following operations:
Data extraction
Character mapping
Creation of new columns by using a synchronous transformation
Multicasting
Sorting by using a blocking (asynchronous) transformation.
Design guidelines for building and maintaining the data flow of a package.
Designing Your SSIS Packages for Parallelism (SQL Server Video) Link Watch this video
Author: Matt Carroll, Microsoft Corporation
This video demonstrates how to increase the performance of your Integration Services packages by designing for parallelism.
------
These four videos join the growing list of SSIS tutorial videos in the SQL Server Video series. Others that might interest you:
Creating a Basic Package
Exporting SQL Server Data to Excel
How to: Automate SSIS Package Execution by Using the SQL Server Agent
How to: Call a Web Service by Using the Web Service Task
How to: Implement a Lookup Transformation in Full Cache Mode
How to: Use the Data Profiling Task
Troubleshooting: SSIS Package Execution Using SQL Server Agent

SQL Server Performance Tuning

5 Key Areas You Must Focus on to Improve Performance & Availability.
www.Heroix.com

SQL Server – Use Of NOLOCK / ROWLOCK

Hi,
While we are working to retrieve huge data or sum(*) data with (NOLOCK) will give better performance. After some quick searching, I found a great page on Sql-Server-Performance.com. Apparently SQL server puts a locking mechanism around all data access and manipulation to prevent things like dirty reads and the reading of uncommitted data. I was totally not aware of this - SQL server just worked and that was good.
Apparently though, these locks (as with most locking) comes with a perforance hit. This is good though as it ensures the integrity of your data. However, there are times when you just don't care. Some tables, especially look up tables, are not updated often (if ever) and locking around these brings on unnecessary overhead.
ROWLOCKUse row-level locks when reading or modifying data.
PAGLOCKUse page-level locks when reading or modifying data.
TABLOCKUse a table lock when reading or modifying data.
DBLOCKUse a database lock when reading or modifying data.
UPDLOCKUPDLOCK reads data without blocking other readers, and update it later with the assurance that the data has not changed since last read.XLOCKUse exclusive locks instead of shared locks while reading a table, and use hold locks until the end of the statement or transaction.
HOLDLOCKUse a hold lock to hold a lock until completion of the transaction, instead of releasing the lock as soon as the required table, row, or data page is no longer required.
NOLOCKThis does not lock any object. This is the default for SELECT operations. It does not apply to INSERT, UPDATE, and DELETE statements.

Using the SQL directives NOLOCK and ROWLOCK can circumvent this SQL Server's locking mechanism and speed up queries. Here, I demonstrate using it on a SQL Join query in which I get Blog entry information:
Examples:SELECT OrderIDFROM Orders (WITH ROWLOCK)WHERE OrderID BETWEEN 100AND 2000
UPDATE Products (WITH NOLOCK)SET ProductCat = 'Machine'WHERE ProductSubCat = 'Mac'

· SELECT
· b.id,
· b.name,
· ( t.id ) AS tag_id,
· ( t.name ) AS tag_name
· FROM
· blog_entry b
· INNER JOIN
· blog_entry_tag_jn btjn (NOLOCK)
· ON
· b.id = btjn.blog_entry_id
· INNER JOIN
· tag t (NOLOCK)
· ON
· btjn.tag_id = t.id
Notice that by using the NOLOCK directive on the blog_entry_tag_jn and the tag Tables. The Tag table pretty much never gets updated and the blog_entry_tag_jn (joining of entries to tags) table gets updated ONLY when add or update a blog entry. Due to the low frequency of updates, the requirement for locking on these tables is (next to) pointless. By using the NOLOCK directive I am asking SQL to ignore all locking mechanism surroundings those tables and proceed directly to data-retrieval. Theoretically, this should provide a small performance gain
Thanks
Rav’s.

RAVINDRANATHREDDY - MSBI - BLOG

MSBI (SSIS/SSRS/SSAS) Online Training