July 8, 2009
Even though I really like the commercials for Microsoft’s Bing that anthropomorphize the absurdity of irrelevant search results, they don’t really hit on what I think is the fundamental problem facing someone confronted with an Internet-sized chunk of data – figuring out what to search for in the first place. Irrelevant search results are definitely annoying, but it usually only takes a minute or so to find some combination of terminology and quotes that generates a couple of pages of good hits. The problem that consumes far more time, and the problem that I just plain fail to solve most often, is trying to identify what I don’t know that I don’t know. What question should I even ask to cut through my overwhelming ignorance of some new domain? It’s kind of like the problem I had once ordering food in a Japanese restaurant.
The menus in this particular Japanese restaurant were entirely in Japanese. English pronunciations were provided (e.g. “ten zaru soba”) but there was no other information about each entrée (at least, I think they were entrées). Since I don’t speak Japanese and I had no knowledge of Japanese cuisine, it seemed all too easy to accidentally order a dish made with something famously disgusting to Western palates, like sea urchin, or even potentially life-threatening, like poisonous pufferfish. The waitress seemed helpful enough to tolerate maybe one or two questions from her ignorant customer, but the question “what do you think I want to eat?” didn’t seem like it would get me good results.
But as the waitress approached, I noticed that the menu was organized into categories. It didn’t matter that the headings for each group of dishes were just as impenetrable to me as the names of the entrées, because now I knew at least one thing that I didn’t know.
“What is this?” I asked pointing to a heading.
“Noodles.”
And under the noodle group, I suddenly observed that the English-alphabet pronunciation of a lot of the dishes ended in the word “soba”. It was like there was a sub-category of noodles that had some kind of soba-like nature. Another thing I now knew that I didn’t know.
“What does soba mean?”
“Those are buckwheat noodles.”
I ordered something called “ten zaru soba” and was very pleased with the result delivered from the kitchen.
It may already be obvious how this anecdote relates to DAC’s analytic products and technology, but I’ll hammer the point home anyway. In cases where the data is overwhelmingly unfamiliar (as in the Japanese restaurant) or overwhelmingly large (as it is in intelligence data sets for country-sized regions), even a small revelation about the structure or categorization of the data can go a long way. DAC’s technology aggressively reveals both large and small aspects of the organization and structure of even the most eclectic data, and it does so on large and small scales and with adjustable levels of fidelity. DAC’s tools and technology make it possible to answer vague but important questions like “What is all of this data about?” and “What kinds of things could I search for?”
Filed under:
BOBCAT by Peter David
May 8, 2009
I was recently challenged by my boss with a revenue goal for the next three years. Revenue goals are kind of a necessary evil for those of us who are managers in the technology arena. We have to have them, but we don’t necessarily like them. Also, I have come to the conclusion that I cannot get up in front of my staff and say “We’re going to set a goal of $XM in the next three years!” where X is some large number, and have them get excited. It’s well documented that this just does not excite the technical folks.
So I started thinking about all of the cool work that we are doing here at DAC. I started thinking about ways that I can take this revenue goal and translate it into specific actions that will get our superb staff pumped up and excited. We have quite a few interesting products, projects, and technologies in our company, so as I was driving home from the meeting I started asking myself “What stake could we put in the ground for each product, project, or technology we have over the next year that would really knock people back on their heels? What would be so impressive that others would be completely bewildered that we could do it?”
I got home and started writing ideas down. Here is the initial list:
- To store and index UAV video in a way that makes it searchable based on observed activities.
- The ability to identify objects in UAV video and automatically understand their behavior.
- The ability to identify and predict a terrorist attack in the maritime domain.
- A revolutionary way to visualize relationships between entities and events that separates us from the traditional “entity-network view”.
- To predict an event (or some suspicious activities) based on the input of vast amounts of structured and unstructured data.
- To predict that a cyber attack is about to occur against the DAC network.
As I said, this is just the initial list. I have challenged the staff to come up with their own ideas for a great demonstration over the next year. Over the next two weeks we should flesh out these ideas, assign “owners” to the challenges, and press forward. Those ideas without owners will be dropped from the list. The plan is to demo these capabilities on real data at our next annual meeting in the March timeframe of next year.
As we refine the list over the next few weeks I will post it and you can watch us make progress.
April 20, 2009
Welcome and thank you for checking out the DAC blog. We have started this blog to provide a window into the work that we are doing at Decisive Analytics Corporation. While we do many things at DAC, the content of this blog will come from those of us inside DAC that are developing cutting edge technologies that are offered through one of three divisions within the company:
- Analytical Products Division - APD is headed by Jessica and is responsible for our BOBCAT product suite. BOBCAT is an analytical tool that provides guided prediction through large amounts of unstructured text. By guided prediction, I mean BOBCAT enables you to navigate your way through large amounts of unstructured data (really anything that can be represented as text) based on automatically discovered themes and relationships. More info on BOBCAT can be found over at the product website http://www.dac.us/products/bobcat/. BOBCAT is largely being applied to the counterterrorism problem, being used to identify suspicious activities and relationships between terrorists identified in text.
- Video & Imaging Division - VID is headed by Tim and is responsible for our Mainship product suite. Mainship is a Media Asset Management (MAM) system which collects, indexes, and stores video assets including open source intelligence and broadcast news. Mainship can work in virtually any language and supports many media types. Mainship is being used today in some of the largest government locations to monitor open source news and provide analysts access to open source situational awareness.
- Technology Innovation Division - TID is headed by Mike. He is charged with working on difficult data fusion techniques that feed our two products, as well as identifying new areas where we can apply the mathematical techniques we have developed. Mike’s current areas of interest include maritime domain awareness, cyber security, entity disambiguation, and CBRN data fusion, among others.
I thought it might be nice to give a simple overview of the organization that will be contributing to this blog. We plan to use this blog, our Twitter accounts, and our website to interact publicly with our customers and partners. We will be opening this blog up to many more of the staff to contribute. I look forward to the discussion that will follow.