R 3.3.2 now available

R 3.3.2, the latest update to the R language, was released today. Binary releases for Linux and Mac are available now from your local CRAN mirror, and the Windows builds will be available shortly.
As a minor update to the R 3.3 series, this release focuses mainly on fixing bugs and doesn’t make any major changes to the language. As a result, you can expect existing scripts and packages to continue to work if you’re upgrading from R 3.3.1. This update includes some performance improvements (particularly in the calculation of eigenvalues), better handling of date axes in graphics, and improved documentation for the methods package. (Fun fact: when printed as a 9MB PDF reference manual, the documentation for the R base and recommended packages now runs to 3,452 pages. That’s almost 3 copies of War and Peace!)
The nickname for this release is “Sincere Pumpkin Patch”, in recognition of the Hallowe’en release date; per RWeekly, it references this clip from “It’s the Great Pumpkin, Charlie Brown”.

For the official announcement from the R core team including the detailed list of changes, see the posting in the R-announce mailing list linked below.
R-announce mailing list: R 3.3.2 is released
 

from Revolutions http://blog.revolutionanalytics.com/2016/10/r-332-now-available.html

Introducing List view for managing your reports

Last week at PASS Summit 2016, we demonstrated many of the enhancements in SQL Server 2016 Reporting Services, but we also showed off a couple of enhancements we’ve made to the Reporting Services web portal since that release in June. Let’s take a look at them!

List view

When you access the new Reporting Services web portal, you see your content in a Tiles view. With your key performance indicators (KPIs) at the top and your reports and files organized by type, Tiles view is great for monitoring your business at a glance or for browsing your reports.

Previous versions of Reporting Services offered a “Details” view as well, and while we wanted to create a modern version for SSRS 2016, we couldn’t quite squeeze it into RTM. Since then, we’ve heard from many of you who love the new web portal and Tiles view (and KPIs!) but miss having a “Details” view.

We’re happy to say that “Details” view is back – and better than before in the form of a new List view:

[Screenshot: the new List view]

With List view, you can

  • See descriptions and other details at a glance
  • Sort (for example, to find the most recently modified items)
  • Move or delete many items at once

You can switch between Tiles and List view from the View menu:

[Screenshot: switching between Tiles and List view from the View menu]

And in a nice enhancement over previous versions of Reporting Services, the new web portal remembers your selection even after you close your web browser, so if you prefer to work in List view, you can choose it once and start there every day.

Context menu

Another point of feedback we heard from you is that you like the ability to right-click a report and see a context menu with useful metadata (e.g., who changed the report and when), but wish you could perform some common tasks without the extra click through the “Manage” page:

[Screenshot: the context menu showing report metadata]

We heard you and we’ve revamped the context menu so common tasks are only a click away, from editing a report, to downloading a copy, to moving it to another folder:

[Screenshot: the revamped context menu with common tasks]

Try it now and send us your feedback

We’re pleased to say that we’re including both List view and the new context menu in an update for SSRS 2016 coming later this fall, and if you’d like to try them today, we’ve included them in the Technical Preview as well.

from Business Intelligence Blogs https://blogs.msdn.microsoft.com/sqlrsteamblog/2016/10/31/introducing-list-view-for-managing-your-reports/

Splunking Kafka At Scale

At Splunk, we love data and we’re not picky about how you get it to us. We’re all about being open, flexible and scaling to meet your needs. We realize that not everybody has the need or desire to install the Universal Forwarder to send data to Splunk. That’s why we created the HTTP Event Collector. This has opened the door to getting a cornucopia of new data sources into Splunk, reliably and at scale.

We’re seeing more customers in Major Accounts looking to integrate their Pub/Sub message brokers with Splunk. Kafka is the most popular message broker we’re seeing out there, but Google Cloud Pub/Sub is starting to make some noise. I’ve been asked multiple times for guidance on the best way to consume data from Kafka.

In the past I’ve just directed people to our officially supported technology add-on for Kafka on Splunkbase. It works well for simple Kafka instances, but if you have a large Kafka cluster comprised of high-throughput topics with tens to hundreds of partitions, it has its limitations:

  • Management is cumbersome. It has multiple configuration topologies and requires multiple collection nodes to facilitate data collection for the given topics.
  • Each data collection node is a simple consumer (a single process) with no ability to auto-balance across the other ingest nodes. If you point it at a topic, it takes ownership of all partitions on that topic and consumes via round-robin across them. If your busy topic has many partitions, this won’t scale well and you’ll lag reading the data. You can scale by creating a dedicated input for each partition in the topic and manually assigning a partition number to each input, but that’s not ideal and creates a burden in configuration overhead.
  • If any worker process dies, the data for its assigned partition won’t get read until it starts back up.
  • It requires a full Splunk instance or Splunk Heavy Forwarder to collect the data and forward it to your indexers.

Due to the limitations stated above, a handful of customers have created their own integrations. Unfortunately, nobody has shared what they’ve built or what drivers they’re using. I’ve created an integration in Python using PyKafka, Requests and the Splunk HTTP Event Collector. I wanted to share the code so anybody can use it as a starting point for their Kafka integrations with Splunk. Use it as is or fork it and modify it to suit your needs.

Why should you consider using this integration over the Splunk TA? The first reason is scalability and availability. The code uses a PyKafka balanced consumer. The balanced consumer coordinates state for several consumers that share a single topic by talking to the Kafka broker and directly to Zookeeper. It registers a consumer group ID that is associated with several consumer processes to balance consumption across the topic. If any consumer dies, a rebalance across the remaining available consumers takes place, which guarantees you will always consume 100% of your pipeline given available consumers. This allows you to scale, giving you parallelism and high availability in consumption. The code also takes advantage of multiple CPU cores using Python multiprocessing. You can spawn as many consumers as available cores to distribute the workload efficiently. If a single collection node can’t keep up with your topic, you can scale horizontally by adding more collection nodes and assigning them to the same consumer group ID.
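To make that concrete, here’s a minimal sketch of the pattern (not the project’s actual code, which is linked below): a PyKafka balanced consumer registered under a shared consumer group, fanned out across CPU cores with multiprocessing. The broker, Zookeeper, topic, and group names are placeholders.

```python
# Minimal sketch of a PyKafka balanced consumer fanned out across CPU cores.
# Hosts, topic, and group names are placeholders, not values from the
# author's integration.
import multiprocessing

from pykafka import KafkaClient


def consume(worker_id):
    client = KafkaClient(hosts="kafka01:9092,kafka02:9092")
    topic = client.topics[b"my.busy.topic"]
    # The balanced consumer coordinates partition ownership via Zookeeper,
    # so every worker registering the same consumer_group shares the load
    # and picks up partitions from any worker that dies.
    consumer = topic.get_balanced_consumer(
        consumer_group=b"splunk-hec-ingest",
        zookeeper_connect="zk01:2181",
        auto_commit_enable=True,
    )
    for message in consumer:
        if message is not None:
            handle(worker_id, message.value)  # e.g. batch and send to HEC


def handle(worker_id, value):
    print(worker_id, value)


if __name__ == "__main__":
    # One consumer process per core; if the node still lags, add more nodes
    # with the same consumer group and Kafka rebalances automatically.
    workers = [
        multiprocessing.Process(target=consume, args=(i,))
        for i in range(multiprocessing.cpu_count())
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```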

The second reason you should consider using it is the simplified configuration. The code uses a YAML config file that is very well documented and easy to understand. Once you have a base config for your topic, you can lay it over all the collection nodes using your favorite configuration management tool (Chef, Puppet, Ansible, et al.) and modify the number of workers according to the number of cores you want to allocate to data collection (or set to auto to use all available cores).
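The real config schema is documented in the project’s repository; purely to illustrate the “set workers to auto” idea, reading such a file might look like this (the file name and key names here are hypothetical):

```python
# Hypothetical example of reading a worker count from a YAML config;
# the actual config layout is documented in the project's repository.
import multiprocessing

import yaml

with open("kafka_consumer.yml") as f:
    config = yaml.safe_load(f)

workers = config.get("workers", "auto")       # hypothetical key
if workers == "auto":
    workers = multiprocessing.cpu_count()     # use every available core
print("spawning %d consumer processes" % workers)
```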

The other piece you’ll need is a highly available HTTP Event Collector tier to receive the data and forward it on to your Splunk indexers. I’d recommend scenario 3 outlined in the distributed deployment guide for the HEC. It consists of a load balancer and a tier of N HTTP Event Collector instances managed by the deployment server.

[Diagram: HEC distributed deployment scenario 3]

The code utilizes the new HEC RAW endpoint, so anything that passes through it goes through the Splunk event pipeline (props and transforms). This requires Splunk version 6.4.0 or later.
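For reference, a raw payload can be sent to that endpoint with Requests along these lines; the host, token, and metadata values are placeholders, and the channel header only matters if indexer acknowledgment is enabled on the token:

```python
# Rough sketch of posting raw events to the Splunk HEC raw endpoint
# (Splunk 6.4+). Host, port, token, and metadata values are placeholders.
import uuid

import requests

HEC_URL = "https://hec.example.com:8088/services/collector/raw"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"  # placeholder token

headers = {
    "Authorization": "Splunk %s" % HEC_TOKEN,
    # A channel identifier is needed when indexer acknowledgment is enabled
    # on the token; it is harmless to include otherwise.
    "X-Splunk-Request-Channel": str(uuid.uuid4()),
}
params = {"sourcetype": "kafka:my.busy.topic", "index": "main"}

# Newline-delimited raw events go straight through props/transforms.
payload = "event one\nevent two\n"

resp = requests.post(HEC_URL, headers=headers, params=params,
                     data=payload, verify=False)  # verify=False for testing only
resp.raise_for_status()
```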

Once you’ve got your HEC tier configured, inputs created, and your Kafka pipeline flowing with data, you’re all set. Just fire up as many instances as necessary for the topics you want to Splunk and you’re off to the races! Feel free to contribute to the code or raise issues and make feature requests on the GitHub page.

Get the code

from Splunk Blogs http://blogs.splunk.com/2016/10/31/splunking-kafka-at-scale/

Today’s Artificial Intelligence Does Not Justify Basic Income

Even the simplest jobs require skills—like creative problem solving—that AI systems cannot yet perform competently.

from Robotics – MIT Technology Review https://www.technologyreview.com/s/602747/todays-artificial-intelligence-does-not-justify-basic-income/

Power BI Desktop October Feature Summary

Today we’re releasing the October Power BI Desktop update, which is filled with several exciting new features! We’ve added several new reporting features, including the much-requested date slicer and snap to grid. We’re also releasing several new analytical features, including grouping and a top N filter.

from Business Intelligence Blogs https://powerbi.microsoft.com/en-us/blog/power-bi-desktop-october-feature-summary/
