Sunday, August 24, 2014

Public Data Sets

Data has become a vital part of our lives, enabling us to become more knowledgeable and make the right decisions. Here are some links to various data sets:

  • List from Data Science Central - http://www.datasciencecentral.com/profiles/blogs/big-data-sets-available-for-free
  • 100+ Interesting Data Sets for Statistics - http://rs.io/2014/05/29/list-of-data-sets.html
  • Microsoft Azure Datasets - http://datamarket.azure.com/
  • Google's Datasets Search Engine - https://www.google.com/cse/publicurl?cx=002720237717066476899:v2wv26idk7m&utm_source=hootsuite&utm_campaign=hootsuite
  • ImportIO - Make your own datasets from webpages - https://import.io/
  • Database Format of Wikipedia articles - http://wiki.dbpedia.org/Downloads39
  • Journalistic Datasets - https://projects.propublica.org/data-store/
  • United States government data sets - http://www.data.gov/
  • United States government statistics sets - http://www.usa.gov/Topics/Reference-Shelf/Data.shtml
  • United States weather data - http://www.ncdc.noaa.gov/
  • World Bank Data - https://finances.worldbank.org/
  • USA Financial Analysis from New York University - http://pages.stern.nyu.edu/~adamodar/New_Home_Page/data.html
  • UCI Machine Learning Repository – probably one of the most popular and comprehensive sources around. 295 data sets in total.
  • Internet Measurement Data Catalog (DataCat) – the ultimate resource if you're looking for huge piles of data coming from internet traffic.
  • Wikipedia – this one might not be so obvious, but Wikipedia, aside from being one of the most popular web sites globally, is also committed to publishing the metadata it collects. You can, for example, check out Wikipedia's page view stats.
  • Data.gov – Catalog of U.S. Government data, consisting of more than 112,000 data sets. Not all of them are large, but they are definitely worth looking at. There are of course other local Open Data sources you might find interesting as well (e.g. the datahub project).
  • AWS Public Data Sets – Amazon provides a central data set repository with nice browsing capabilities.
  • Microsoft Azure MarketPlace – last but not least, you can find some data inside the Azure platform too. As the name implies, not all of it is free though…
Original Sources of Information: Buckwoody and BigX Blogs

Sunday, July 6, 2014

Top Coding Competition Websites

  • ACM-ICPC is one of the largest programming contests run exclusively for college students. Its top teams compete for prize money in an Olympic-style competition at the world finals. Partially sponsored by IBM, the annual contest primarily involves algorithmic programming problems, supported in C/C++ and Java. Google and other companies have hired a number of world finalists.
  • CodingBat is a “fundamentals” live coding site that offers up problems in Java and Python. For programmers looking to bone up on the basics, CodingBat provides instant feedback. Nick Parlante, a computer science lecturer at Stanford, started the site as a research project.
  • CodeChef hosts a programming contest at the start of each month and another, smaller challenge in the middle of the month. Its global programming community is described as a “noncommercial educational initiative” from Directi, an Internet products company based in Mumbai. The competition accepts solutions in 35+ programming languages, including C, C++, Java and Python.
  • Codeforces regularly hosts about six contests a month. Russian developer Michael Mirzayanov created the site, which is configured to support about 17 programming languages. There’s also a blog area where members can wax philosophic about their programming dilemmas. Users can also create challenges of their own.
  • HackerRank, a social platform for programming competitions, runs the bulk of its contests for employers searching for new talent. Besides functional programming, contestants can solve problems in different CS domains like algorithms, machine learning and artificial intelligence.
  • TopCoder, part of IT consultant Appirio, bills itself as the world’s largest competitive software development community. A full 99 percent of the site is run for clients, with employers using it to vet talent or test new languages. TopCoder also runs its own weekly coding competitions for fun, usually single round, time-sensitive matches based on a variety of technologies, including VM and Python.
    Source for this article is dice.com

Saturday, May 31, 2014

FTP using Powershell

The following script can be used to perform FTP uploads and downloads using PowerShell:

param(
    [Parameter(Mandatory=$true, Position=0)] [string] $SourceFilePath,
    [Parameter(Mandatory=$true, Position=1)] [string] $TargetFilepath,
    [Parameter(Mandatory=$true, Position=2)] [ValidateSet("Upload","Download")] [string] $OperationType
)

try
{
    # Placeholder credentials; replace with your own
    $Username = "temp-account"
    $Password = "tempPassword"

    # The remote path is the target for uploads and the source for downloads
    if ($OperationType -eq "Upload")
    {
        $serverUri = "ftp://ftp2.myweb.com" + $TargetFilepath
    }
    else
    {
        $serverUri = "ftp://ftp2.myweb.com" + $SourceFilePath
    }

    [System.Net.FtpWebRequest] $FTPRequest = [System.Net.FtpWebRequest]::Create($serverUri)
    $FTPRequest.Credentials = New-Object System.Net.NetworkCredential($Username, $Password)
    $FTPRequest.Timeout = 10000      # Timeout in milliseconds
    $FTPRequest.UsePassive = $true
    $FTPRequest.UseBinary = $false   # ASCII mode, intended for text files; set to $true for binary files
    $FTPRequest.KeepAlive = $false   # Important: closes the control connection after the operation completes

    if ($OperationType -eq "Upload")
    {
        $FTPRequest.Method = [System.Net.WebRequestMethods+Ftp]::UploadFile

        Write-Host "Uploading File..."

        # Read the file to be uploaded as raw bytes
        # (Get-Content -Encoding Byte is not available in PowerShell 6 and later)
        [byte[]] $FileContents = [System.IO.File]::ReadAllBytes($SourceFilePath)

        # Create the request stream and write the file to it
        [System.IO.Stream] $requestStream = $FTPRequest.GetRequestStream()
        $requestStream.Write($FileContents, 0, $FileContents.Length)
        $requestStream.Close()

        # Check the status of the upload
        [System.Net.FtpWebResponse] $response = $FTPRequest.GetResponse()
        Write-Host "File Upload Status $($response.StatusDescription)"
        $response.Close()
    }
    elseif ($OperationType -eq "Download")
    {
        Write-Host "Downloading File..."
        $FTPRequest.Method = [System.Net.WebRequestMethods+Ftp]::DownloadFile

        # Get the response object
        [System.Net.FtpWebResponse] $FTPResponse = $FTPRequest.GetResponse()
        if ($FTPResponse -ne $null)
        {
            [System.IO.Stream] $responseStream = $FTPResponse.GetResponseStream()
            [System.IO.FileStream] $FileStream = New-Object System.IO.FileStream($TargetFilepath, [System.IO.FileMode]::Create)
            [byte[]] $ReadBuffer = New-Object byte[] 1024

            # Copy the response to the local file in 1 KB chunks until the stream is exhausted
            do {
                $ReadLength = $responseStream.Read($ReadBuffer, 0, 1024)
                $FileStream.Write($ReadBuffer, 0, $ReadLength)
            } while ($ReadLength -ne 0)

            Write-Host "File Download Status $($FTPResponse.StatusDescription)"

            $FileStream.Close()
            $responseStream.Close()
            $FTPResponse.Close()
        }
    }
}
catch [Exception]
{
    # Release any streams that are still open
    if ($requestStream -ne $null)
    {
        $requestStream.Dispose()
    }

    if ($responseStream -ne $null)
    {
        $responseStream.Dispose()
    }

    if ($FileStream -ne $null)
    {
        $FileStream.Dispose()
    }

    Write-Host "Exception Type: $($_.Exception.GetType().FullName)`nException Code: $($_.Exception.ErrorCode); Exception Message: $($_.Exception.Message)"
}


#Usage (run from the folder that contains the script)

#Upload
.\FTPUploadDownload.ps1 "D:\Temp\FileNametobeUploaded.txt" "/FTPFolder/Targetfilename.txt" Upload

#Download
.\FTPUploadDownload.ps1 "/FTPFolder/Sourcename.txt" "D:\Temp\DownloadedFileName.txt" Download
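
The download branch of the script copies data in fixed-size chunks, while the upload branch reads the whole file before writing it. The same chunked approach could be used for uploads, so that a large file does not have to be held in memory in one piece. Below is a minimal sketch of that idea. The helper name Copy-StreamInChunks and its parameters are assumptions made for illustration and are not part of the script above. It is demonstrated with in-memory streams rather than a real FTP connection.

```powershell
# Hypothetical helper (not part of the original script): copies one stream
# to another in fixed-size chunks and returns the number of bytes copied.
function Copy-StreamInChunks {
    param(
        [Parameter(Mandatory=$true)] [System.IO.Stream] $Source,
        [Parameter(Mandatory=$true)] [System.IO.Stream] $Destination,
        [int] $BufferSize = 8192
    )

    $buffer = New-Object byte[] $BufferSize
    $total = 0

    # Read returns 0 once the source stream is exhausted
    while (($read = $Source.Read($buffer, 0, $buffer.Length)) -gt 0) {
        $Destination.Write($buffer, 0, $read)
        $total += $read
    }

    return $total
}

# Demonstration with in-memory streams standing in for a file and an FTP request stream
$bytes = [System.Text.Encoding]::ASCII.GetBytes("hello ftp")
$in = New-Object System.IO.MemoryStream (,$bytes)
$out = New-Object System.IO.MemoryStream
$copied = Copy-StreamInChunks -Source $in -Destination $out -BufferSize 4
Write-Host "Copied $copied bytes"
```

In an upload, $Source would presumably be a file stream opened with [System.IO.File]::OpenRead and $Destination the stream returned by GetRequestStream(). Both streams would still need to be closed by the caller.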