{"id":3009811,"date":"2025-07-15T08:00:00","date_gmt":"2025-07-15T15:00:00","guid":{"rendered":"urn:uuid:49c3b29e-d3f1-4fe4-8cc6-787f7635623e"},"modified":"2025-07-11T10:02:47","modified_gmt":"2025-07-11T17:02:47","slug":"generating-6dof-object-manipulation-trajectories","status":"publish","type":"research-post","link":"https:\/\/sonyinteractive.com\/en\/innovation\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/","title":{"rendered":"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision"},"content":{"rendered":"\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:70%\">\t\t<div\n\t\t\tclass=\"wp-block-sie-social-share social-share--raw\"\n\t\t\tdata-aa-modulename=\"social-share\"\n\t\t\tdata-url=\"https:\/\/sonyinteractive.com\/en\/innovation\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/\"\n\t\t\tdata-title=\"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision\"\n\t\t\tdata-heading=\"Share this\"\n\t\t\tdata-bluesky-title=\"\"\n\t\t\tdata-twitter-title=\"\"\n\t\t\tdata-twitter-hashtags=\"\"\n\t\t\tdata-reddit-title=\"\"\n\t\t\tdata-email-subject=\"\"\n\t\t\tdata-email-body=\"\"\n\t\t\t>\n\t\t<\/div>\n\t\n\n<div class=\"post-author\" data-no-of-bylines=\"4\"><ul><li>Taichi Nishimura<span class=\"sie-style-body-small\">Sr Machine Learning Engineer, Sony Interactive Entertainment<\/span><\/li><li>Tomoya Yoshida<span class=\"sie-style-body-small\">Kyoto University<\/span><\/li><li>Shuhei Kurita<span class=\"sie-style-body-small\">National Institute of Informatics<\/span><\/li><li>Shinsuke Mori<span class=\"sie-style-body-small\">Kyoto University<\/span><\/li><\/ul><\/div>\n\n\n<p class=\"sie-paragraph\">Learning to use tools or objects in common scenes, 
particularly handling them in various ways as instructed, is a key challenge for developing interactive robots. Training models to generate such manipulation trajectories requires a large and diverse collection of detailed manipulation demonstrations for various objects, which is nearly infeasible to gather at scale. In this paper, we propose a framework that leverages Exo-Ego4D, a large-scale ego- and exo-centric video dataset constructed globally with substantial effort, to extract diverse manipulation trajectories at scale. From these extracted trajectories and their associated textual action descriptions, we develop trajectory generation models based on visual and point cloud-based language models. On HOT3D, a recently proposed egocentric-vision dataset with high-quality trajectories, we confirm that our models successfully generate valid object trajectories, establishing a training dataset and baseline models for the novel task of generating 6DoF manipulation trajectories from action descriptions in egocentric vision.<\/p>\n\n\n\n<p class=\"sie-paragraph\">Read the full paper here: <a class=\"sie-paragraph\" href=\"https:\/\/arxiv.org\/abs\/2506.03605\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/arxiv.org\/abs\/2506.03605<\/a><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n\t\t<div class=\"related-posts__container\" data-selection-type=\"latest\" data-post-type=\"research-post\">\n\t\t\t<h2 class=\"related-posts__heading sie-style-h5\">Latest Research Posts<\/h2>\n\t\t\t<div class=\"related-posts related-posts--vertical\">\n\t\t\t\t<article class=\"related-post related-post--research-post\"><div class=\"related-post__content\"><h3 class=\"related-post__title sie-style-body-small-v2\">\n\t\t<a href=\"https:\/\/sonyinteractive.com\/en\/innovation\/research-academia\/research\/content-adaptive-encoding-for-interactive-game-streaming\/\">\n\t\t\tContent Adaptive Encoding for 
Interactive Game Streaming\n\t\t<\/a>\n\t<\/h3><\/div><\/article><article class=\"related-post related-post--research-post\"><div class=\"related-post__content\"><h3 class=\"related-post__title sie-style-body-small-v2\">\n\t\t<a href=\"https:\/\/sonyinteractive.com\/en\/innovation\/research-academia\/research\/vision-language-models-for-quality-assurance\/\">\n\t\t\tVideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance\n\t\t<\/a>\n\t<\/h3><\/div><\/article><article class=\"related-post related-post--research-post\"><div class=\"related-post__content\"><h3 class=\"related-post__title sie-style-body-small-v2\">\n\t\t<a href=\"https:\/\/sonyinteractive.com\/en\/innovation\/research-academia\/research\/learning-representations-in-video-game-agents\/\">\n\t\t\tLearning Representations in Video Game Agents with Supervised Contrastive Imitation Learning\n\t\t<\/a>\n\t<\/h3><\/div><\/article>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-sie-scroll-to-top\" data-aa-modulename=\"sie-scroll-to-top\"><button class=\"sie-btn sie-btn--action\" type=\"button\"><span>Back to top<\/span><\/button><\/div>\n","protected":false},"author":34,"parent":0,"template":"","byline":[316,317,318,319],"research-post-category":[201],"class_list":["post-3009811","research-post","type-research-post","status-publish","hentry","research-post-category-robotics","post-generating-6dof-object-manipulation-trajectories"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.0 (Yoast SEO v27.0) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision - Sony Interactive Entertainment<\/title>\n<meta name=\"description\" content=\"Discover SIE&#039;s recent research on generating 6DoF object manipulation trajectories from action description in egocentric vision.\" \/>\n<meta name=\"robots\" content=\"index, 
follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision\" \/>\n<meta property=\"og:description\" content=\"Discover SIE&#039;s recent research on generating 6DoF object manipulation trajectories from action description in egocentric vision.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/\" \/>\n<meta property=\"og:site_name\" content=\"Sony Interactive Entertainment\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"1 minute\" \/>\n\t<meta name=\"twitter:label2\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data2\" content=\"Taichi Nishimura\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/\",\"url\":\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/\",\"name\":\"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision - Sony Interactive Entertainment\",\"isPartOf\":{\"@id\":\"https:\/\/sonyinteractive.com\/en\/#website\"},\"datePublished\":\"2025-07-15T15:00:00+00:00\",\"description\":\"Discover SIE's recent research on generating 6DoF object manipulation trajectories from action description in egocentric vision.\",\"breadcrumb\":{\"@id\":\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/sonyinteractive.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric 
Vision\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/sonyinteractive.com\/en\/#website\",\"url\":\"https:\/\/sonyinteractive.com\/en\/\",\"name\":\"Sony Interactive Entertainment\",\"description\":\"Pushing the Boundaries of Play\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/sonyinteractive.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision - Sony Interactive Entertainment","description":"Discover SIE's recent research on generating 6DoF object manipulation trajectories from action description in egocentric vision.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/","og_locale":"en_US","og_type":"article","og_title":"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision","og_description":"Discover SIE's recent research on generating 6DoF object manipulation trajectories from action description in egocentric vision.","og_url":"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/","og_site_name":"Sony Interactive Entertainment","twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"1 minute","Written by":"Taichi Nishimura"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/","url":"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/","name":"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision - Sony Interactive Entertainment","isPartOf":{"@id":"https:\/\/sonyinteractive.com\/en\/#website"},"datePublished":"2025-07-15T15:00:00+00:00","description":"Discover SIE's recent research on generating 6DoF object manipulation trajectories from action description in egocentric vision.","breadcrumb":{"@id":"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/sonyinteractive.com\/en\/innovation\/technology\/research-academia\/research\/generating-6dof-object-manipulation-trajectories\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/sonyinteractive.com\/en\/"},{"@type":"ListItem","position":2,"name":"Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision"}]},{"@type":"WebSite","@id":"https:\/\/sonyinteractive.com\/en\/#website","url":"https:\/\/sonyinteractive.com\/en\/","name":"Sony Interactive Entertainment","description":"Pushing the Boundaries of 
Play","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/sonyinteractive.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"ab_tests":{},"_links":{"self":[{"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/research-post\/3009811","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/research-post"}],"about":[{"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/types\/research-post"}],"author":[{"embeddable":true,"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/users\/34"}],"version-history":[{"count":2,"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/research-post\/3009811\/revisions"}],"predecessor-version":[{"id":3009813,"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/research-post\/3009811\/revisions\/3009813"}],"wp:attachment":[{"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/media?parent=3009811"}],"wp:term":[{"taxonomy":"byline","embeddable":true,"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/byline?post=3009811"},{"taxonomy":"research-post-category","embeddable":true,"href":"https:\/\/sonyinteractive.com\/en\/wp-json\/wp\/v2\/research-post-category?post=3009811"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}