Emerging Into the Data Age
What we really want to see emerging in 2021 is all of us, from our homes. By summer, a semblance of normalcy and stability should begin asserting itself globally. Old patterns will return, but with changes. Other regions may maintain the habit of wearing face masks in public even after the pandemic. Temperature checks as you board a plane may be a thing, and restaurants might never feel crowded again.
A pattern that wasn’t interrupted by the pandemic, but instead shifted into overdrive, is digital transformation. The Data Age, defined by greater interconnectedness through ubiquitous digital technologies, was already here. Now it’s really here.
“COVID-19 has been a catalyst, greatly accelerating digital transformation,” says Ammar Maraqa, Splunk’s senior vice president and chief strategy officer. “For business and IT leaders, the strategic long view has been dramatically compressed. Disruptions anticipated in five or 10 years have been compressed to a horizon of months or weeks.”
For a lot of organizations, high-speed transformation is going to look like a lot more cloud. Organizations that were already fairly mature in their cloud adoption are pushing into automation and machine learning. Orgs that can really push the envelope are planning for 5G, investigating augmented reality or blockchain, or deploying edge computing solutions.
A lot of powerful software is in our near future, says Splunk’s head of mobile engineering, Jesse Chor, in large part because of a key hardware development: the rise of the graphics processing unit (GPU). The central processing unit, the computer’s brain and a fundamental determiner of how fast a program can run, continues to steadily improve. But in recent years, the GPU, originally designed for 3D graphics and therefore central to gaming, has become increasingly important to use cases beyond “first-person shooter.”
Playing the Game
Sure, gaming isn’t the only use of advancing GPUs, but it’s still a big one.
Splunk’s chief technology officer, Tim Tully, says the improvement in GPUs makes gaming a more interesting tech space than ever. He also admits that since the pandemic, he has become a more regular gamer. “GPU advancements are allowing developers to make these super-immersive games, and VR will be an interesting element there,” he says. “And since the pandemic, gaming has become an even bigger activity, a more important escape mechanism, than ever. I think it will consume more of the budget we used to spend in restaurants or movie theaters.”
“The evolution of GPU is going to be a big one,” Chor says, “because that’s what unlocks machine learning on edge, that’s what unlocks AR, that’s what unlocks all these great future experiences. And then if you combine that with the low latency and high bandwidth of 5G and WiFi 6, you’ve got everything you need.”
This year’s predictions for emerging technologies include a number of ways to keep the most powerful GPUs busy.
Predictions and Survival Strategies for 2021
Artificial Intelligence/Machine Learning (AI/ML): Self-learning ML will help us see beyond buzzwords to value. Challenges: adversarial attacks, AI ethics, human training.
5G will be delayed, then big. Meanwhile, look at WiFi 6. Also: contactless payments, two-factor authentication and biometrics.
AR: Pandemic setbacks may lead to immersive collaborative tools and healthcare applications.
VR: The hardware’s there. It’s do-or-die on the software front.
Biometrics was already taking off before the pandemic.
The marketplace finally gets blockchain; it’s focused on efficiencies today, and will see wilder possibilities in the years ahead.
Edge Computing: Where it all comes together.
Artificial Intelligence/Machine Learning (AI/ML)
The adoption of AI/ML technologies was already under way when the pandemic hit, but since COVID-19 disrupted public health, the economy and pretty much every other aspect of how we live and work, organizations have significantly sped up their incorporation of machine learning algorithms.
“We’re seeing that particularly, but not exclusively, with security use cases,” says Ram Sriharsha, Splunk’s head of machine learning.
For algorithms already in use, the pandemic has created challenges. Predictive retail algorithms falter when our behaviors change suddenly and significantly. As the MIT Technology Review noted in May, “Machine-learning models trained on normal human behavior are now finding that normal has changed, and some are no longer working as they should.”
In other words, online retailers’ recommendation engines weren’t quite ready back in March for everyone to suddenly care about nothing but toilet paper and hand sanitizer.
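One way to picture "normal has changed" is as a shift in the mix of what people buy. The sketch below is a minimal illustration, not anything described in the report: it compares a baseline week with a recent week using the Population Stability Index (PSI), a common drift measure. The order logs, category names and the 0.25 threshold are all assumptions made up for the example.

```python
import math
from collections import Counter

def category_shares(purchases, categories, floor=1e-6):
    """Return each category's share of purchases, floored to avoid log(0)."""
    counts = Counter(purchases)
    total = len(purchases)
    return {c: max(counts[c] / total, floor) for c in categories}

def population_stability_index(baseline, recent, categories):
    """PSI = sum over categories of (recent - baseline) * ln(recent / baseline)."""
    base = category_shares(baseline, categories)
    new = category_shares(recent, categories)
    return sum((new[c] - base[c]) * math.log(new[c] / base[c]) for c in categories)

categories = ["groceries", "electronics", "toilet paper", "hand sanitizer"]

# Hypothetical order logs: a typical week versus a panic-buying week.
typical_week = (["groceries"] * 60 + ["electronics"] * 30
                + ["toilet paper"] * 8 + ["hand sanitizer"] * 2)
panic_week = (["groceries"] * 20 + ["electronics"] * 5
              + ["toilet paper"] * 45 + ["hand sanitizer"] * 30)

psi = population_stability_index(typical_week, panic_week, categories)

# 0.25 is a widely used rule of thumb for "significant shift", not a standard.
DRIFT_THRESHOLD = 0.25
needs_retraining = psi > DRIFT_THRESHOLD
print(f"PSI = {psi:.2f}, retrain: {needs_retraining}")
```

A check like this does not fix a recommendation engine, but it can flag that the data a model was trained on no longer resembles what it is seeing.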
At this point, machine learning has mostly caught up with the changes caused by the pandemic and recession. And across industries, algorithms have been, if not commoditized, then democratized. The major cloud providers are offering the hardware and software to bring the power of machine learning to their customers. Those providers and other third-party vendors are delivering “AI as a service.” And, more importantly, the idea of artificial intelligence has taken hold in the corporate imagination.
AI/ML will be held back by its own limitations until it can learn on its own.
One thing holding back AI/ML adoption is the resource overhead, says Ram Sriharsha, Splunk’s head of machine learning.
“The problem with the traditional ML pattern, in which people spend a lot of time building and deploying models, is that it just doesn’t scale,” he says. “Organizations are evolving at a much faster rate, and the questions you’re asking your data are evolving. You can’t hire data science teams fast enough to keep up.”
The answer, he says, is to automate the learning in machine learning. “You not only have to automate the process of creating models and deploying them, you have to automate the process of learning and relearning.”
A challenge closely connected to the human-intensive process, he notes, is the fact that most machine learning models rely on well-structured, clearly labeled data to learn.
“You’re going to see increasing amounts of R&D energy trying to solve these two problems, which is how to make the algorithms learn with as few labels and as little human input as possible,” Sriharsha says. “The more I can throw ML at unstructured data and have the algorithm figure out how to extract what it needs from the data, the more powerful its contribution to the organization will be.”
It’s a lot harder to get an algorithmic model to work with the shifting, unstructured sources of data that would drive better retail recommendations or supply chain refinements than it is to get a model to master structured board games like chess and Go, but that’s what CIOs want. It’s definitely what our CIO wants.
“I’m keeping my eyes on self-learning systems,” says Splunk CIO Steve McMahon. “I want my process automation largely to be self-aware, so to speak, and learning so that it can identify the greatest opportunities.”
Defense against adversarial learning will improve in the next few years. Because it has to.
Last year, our predictions report warned of the potential threat of AI sabotage: You can poison the outcomes of AI-driven automation by poisoning the data it learns from. We gave the example of tricking an autonomous vehicle into misunderstanding a stop sign. In September, researchers found that a tiny sticker on an object the size of a fighter jet could hide it from an AI processing drone footage. The threat of data deception remains on the horizon, and a new area of research will have to rise to the challenge, because today’s AI is as naive as a week-old puppy.
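To make the idea of poisoned training data concrete, here is a toy sketch. It is an illustration under stated assumptions, not a description of any real attack or system. A tiny nearest-centroid classifier learns to tell "stop" signs from "other" objects using a single made-up feature (a hypothetical redness score between 0 and 1). An attacker then slips mislabeled examples into the training set, and the same sign gets a different answer.

```python
from statistics import mean

def train_centroids(examples):
    """Learn one centroid (mean feature value) per label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Return the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Hypothetical clean training data: (redness score, label).
clean = [(0.80, "stop"), (0.85, "stop"), (0.90, "stop"),
         (0.10, "other"), (0.20, "other"), (0.30, "other")]

# Hypothetical poison: red-looking examples deliberately labeled "other".
poison = [(0.70, "other")] * 6

sign = 0.65  # a slightly faded stop sign

clean_prediction = predict(train_centroids(clean), sign)
poisoned_prediction = predict(train_centroids(clean + poison), sign)

print(clean_prediction)     # stop
print(poisoned_prediction)  # other
```

The model code never changes between the two runs. Only the data does, which is why an algorithm that trusts its training data is exposed.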
“Machine learning algorithms trust the data they learn from,” Ram Sriharsha says. “But what happens if people are trying to hack you? As an industry, we haven’t thought carefully about how to learn in the presence of adversaries.”
He says that researchers will need to explore how to make their models robust against adversaries. And he says that now is the time to develop those techniques, because the potential power of such attacks will grow thanks to standard market forces.
“In time, there will not be hundreds of machine-learning startups selling hundreds of machine-learning platforms,” Sriharsha says. “There will be a few, or one.”
And just like the dominance of Microsoft’s operating system gave hackers one big target, a small number of dominant AI platforms would draw all the attacks.
“Once that market consolidates around one platform that almost everybody is using, hackers are really incentivized to figure out how to break it,” he says. “With that kind of adversarial attention, we have to spend a lot of energy right now to build robust algorithms that can withstand attack.”
Look beyond AI’s buzzword heat to get real, meaningful value from AI/ML.
AI is often a black box, a vague promise, a hope for a Star Trek future. Certainly vendors are slapping “Now with AI!” on products like it’s extra raisins in your Raisin Bran. And corporate customers are getting caught up in the ill-defined excitement.
“A lot of customers won’t even think about using your product if it doesn’t have AI built into it, or the potential for it to be integrated as soon as the initial use cases or outcomes are derived,” says John Sabino, Splunk’s chief customer officer. “But a lot of the time when people ask for AI, they’re just checking a box. They’re not sure what it is, but they’re afraid of being left behind.”
Simon Davies, vice president of Splunk in APAC, says that organizations in the Asia-Pacific region are already outgrowing the buzzword phase. “They’ve moved on from AI being something that you specifically think about to being a core part of any type of decision or technology,” he says. “Instead, the conversation is about ‘How can your platform assist us?’”
Delivering an AI-based product, that’s a vendor’s job. But how an organization uses AI to be more competitive or deliver better outcomes, that’s a strategic consideration. Which is important for an organization to consider, Sabino adds. “You’re looking to build relationships with providers who really understand how to leverage AI, and who really understand your business and use cases, because this is a strategic relationship. It’s how a company in a competitive industry can leapfrog the competition.”
Prediction: Increased attention to the challenges of ML bias will build ethical responsibilities into engineers’ job descriptions.
As we leave more decisions to algorithms, there will be increased attention to how they’re making those decisions. The utopian vision of fair outcomes derived from dispassionate examination of objective data overlooks the matter of who selects the data sets and designs the algorithms themselves: flawed and inevitably biased human beings.
We’ve seen examples of bias already. Our predictions last year noted the adversarial learning example of jerks (technical term) teaching a crowdsourced chatbot to be racist. Algorithmic bias in the mortgage industry is a well-known failing, and facial recognition hit a wall amid multiple controversies in 2020.
A July paper in Royal Society Open Science discussed the problem of “unethical optimization” and mathematical methods for detecting and eliminating such biases. Ram Sriharsha expects that pursuit to take on added importance in the next few years. But he says that the goal of completely eliminating bias is doomed to failure, so it has to be coupled with a goal to at least understand bias when it (inevitably) occurs.
“Take the classic example of loan applications being racially biased, even when it’s not an intention but an inadvertent result of the data you fed it,” Sriharsha says. “You can’t teach an algorithm, ‘Recognize race and don’t be a jerk about it.’ So explainability is going to be crucial.”
If we can understand how an algorithmic model produced an objectionable outcome, we can more quickly adjust it to produce better, fairer conclusions. Was the training data biased because it didn’t accurately reflect the full ethnic and gender makeup of society? Does a particular data set reflect societal biases that the model is then reinforcing? Are there other data sets that would contribute to a clearer picture of our society, and produce outcomes that better align with an organization’s or community’s values?
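The first two questions above can be turned into simple checks. The sketch below is a minimal illustration with invented numbers, not a complete fairness audit: it compares each group's share of a training set against a reference population, then compares approval rates across groups. The group names, reference shares and decisions are all hypothetical.

```python
from collections import Counter

def representation_gaps(training_groups, reference_shares):
    """Training-set share minus reference share for each group."""
    counts = Counter(training_groups)
    total = len(training_groups)
    return {g: counts[g] / total - share for g, share in reference_shares.items()}

def approval_rates(decisions):
    """Approval rate per group from (group, approved) pairs."""
    totals, approved = Counter(), Counter()
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += ok  # True counts as 1, False as 0
    return {g: approved[g] / totals[g] for g in totals}

# Hypothetical training set: group A is over-represented.
training_groups = ["A"] * 80 + ["B"] * 20
reference_shares = {"A": 0.6, "B": 0.4}  # assumed makeup of the population

# Hypothetical model decisions on loan applications.
decisions = ([("A", True)] * 60 + [("A", False)] * 20
             + [("B", True)] * 8 + [("B", False)] * 12)

gaps = representation_gaps(training_groups, reference_shares)
rates = approval_rates(decisions)
parity_gap = max(rates.values()) - min(rates.values())

print(gaps)
print(rates)
print(f"approval-rate gap: {parity_gap:.2f}")
```

Numbers like these do not explain why a model behaves as it does, but they show where to start asking.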
Explainability can help us fix errant models as they err, but the bigger question, Sriharsha says, is how we handle ethical issues up front, before the models are turned loose.
“It starts with education, evolving the course work we use to train future computer science engineers,” he says. “But more than that, we’re already seeing partnerships designed to prevent these unintended biases. If computer scientists in the past have worked in isolation, now they’re working with ethicists, economists and sociologists to understand the societal implications of certain models.”
If we can’t teach every software engineer to be a sociologist, urban planner, community activist and moral philosopher, we can at least make sure the engineer has them all on speed dial.
These consultations, and the very act of considering wider ranges of consequence, mean that better, fairer models will take more time and consideration to build. Problem: “Let’s slow down and think this through” is not typical Silicon Valley cocktail conversation.
“‘Move fast and break things’ has been the mantra of Silicon Valley,” Sriharsha notes. “And I think that for Silicon Valley entrepreneurship to remain at the top, you have to move fast. Especially in a rapidly evolving field like AI. But we’ll probably have to learn to move fast and not break things.”
Machine learning will help speed the discovery of new medicines — in part by looking at previous “failures.”
This prediction looks at one of the fields where data obsessed us all in 2020: healthcare. The current crisis is finding new applications of artificial intelligence, with the healthcare sector seen as a major growth area. For instance, the U.S. Centers for Disease Control developed a coronavirus chatbot that uses AI to tell you whether you need to go to the hospital.
But a challenge in applying machine learning to healthcare is that it’s a human-intensive field, in which a lot of what happens in terms of treatment and research moves, by necessity, at a human pace. And the smallish numbers involved in clinical trials or individual treatments don’t really cry out for advanced algorithmic support, Sriharsha notes.
“Where I feel AI/ML make the most sense in medicine today is not in current trials, but in the thousands that have already occurred,” he says. “Clinical trials are extremely expensive. Companies spend billions of dollars on them, and many fail. There are vast troves of data on the structure of drugs that didn’t move forward, but might have undiscovered promise.”
Some clinical trials are stopped not because a drug fails, he says, but because it wasn’t effective enough in an envisioned scenario. But it’s possible that those drugs, or new structural variations of them, could be more effective in other important scenarios. This existing, unused data could help researchers more quickly zero in on new, effective medicines.
“If you want to see the biggest impact of AI in healthcare and life sciences in the next 10 years,” Sriharsha says, “probably that is where it’s going to happen.”