{"id":67,"date":"2025-05-05T16:57:37","date_gmt":"2025-05-05T14:57:37","guid":{"rendered":"https:\/\/projects.dimes.unical.it\/music4d\/?page_id=67"},"modified":"2025-12-17T19:41:01","modified_gmt":"2025-12-17T18:41:01","slug":"lattrezzatura","status":"publish","type":"page","link":"https:\/\/projects.dimes.unical.it\/music4d\/lattrezzatura\/","title":{"rendered":"Equipment"},"content":{"rendered":"\n<div style=\"font-family: 'Segoe UI', sans-serif;color: #2c2c2c;padding: 60px 20px;max-width: 1100px;margin: auto;background: #fefefe;border-radius: 16px\">\n\n  <h1 style=\"text-align: center;font-size: 42px;margin-bottom: 10px;color: #1a1a1a\">\n    <strong>System Architecture Overview<\/strong>\n  <\/h1>\n  <p style=\"text-align: center;font-size: 18px;color: #666\">\n    High-Performance Edge Computing, Multi-Sensory Acquisition &amp; Embodied AI\n  <\/p>\n\n  <hr style=\"margin: 40px auto;border: none;height: 2px;width: 60%;background: linear-gradient(to right, #ccc, #eee)\">\n\n  <section style=\"margin-bottom: 50px\">\n    <p style=\"font-size: 17px;line-height: 1.8;text-align: center;max-width: 800px;margin: auto\">\n      The <strong>MUSIC4D<\/strong> technical architecture is designed to handle complex real-time processing of multimodal data. It combines a powerful Edge Node infrastructure for heavy computation, a delicate sensory interface for environmental inputs, and state-of-the-art robotics for physical interaction.\n    <\/p>\n  <\/section>\n\n  <section style=\"margin-bottom: 60px\">\n    <h2 style=\"font-size: 28px;margin-bottom: 20px;border-bottom: 3px solid #0078d4;padding-bottom: 5px\">\n      \u26a1 Computational Core: Edge Nodes\n    <\/h2>\n    <p style=\"font-size: 16px;line-height: 1.7;margin-bottom: 20px\">\n      At the heart of our processing capabilities are the <strong>Edge Nodes<\/strong>. 
These units are engineered to run sophisticated deep learning models, including Vision-Language Models (VLMs), and to handle the &#8220;Tutti-bot&#8221; framework operations with low latency.\n    <\/p>\n    \n    <div style=\"background: #f0f7ff;padding: 25px;border-radius: 12px;border-left: 5px solid #0078d4\">\n      <h3 style=\"margin-top: 0;font-size: 20px;color: #004e8c\">GPU Acceleration<\/h3>\n      \n      <p style=\"font-size: 16px;margin-bottom: 0\">\n        To ensure maximum throughput for AI workloads, the nodes are equipped with high-performance graphics units. The architecture leverages the power of <strong>NVIDIA RTX PRO 6000 Workstations<\/strong> (or alternatively <strong>RTX A4500<\/strong>), providing the necessary CUDA cores for real-time inference, generative audio synthesis, and foundation model fine-tuning.\n      <\/p>\n    <\/div>\n    <p style=\"font-size: 14px;color: #666;margin-top: 15px\">\n      *Additionally, edge computing is supported by <strong>NVIDIA Jetson AGX Orin<\/strong> boards for localized robotic processing.\n    <\/p>\n  <\/section>\n\n  <section style=\"margin-bottom: 60px\">\n    <h2 style=\"font-size: 28px;margin-bottom: 20px;border-bottom: 3px solid #ff6f3c;padding-bottom: 5px\">\n      \ud83c\udf99\ufe0f Sensory Interface: Audio Acquisition\n    <\/h2>\n    <p style=\"font-size: 16px;line-height: 1.7;margin-bottom: 20px\">\n      To interpret the &#8220;voice&#8221; of the environment and the audience, the system utilizes a tiered microphone setup managed by a <strong>Focusrite Scarlett 18i20 (4th Gen)<\/strong> multi-channel audio interface.\n    <\/p>\n\n    <div style=\"display: flex;gap: 20px;flex-wrap: wrap\">\n      <div style=\"flex: 1;min-width: 300px;background: #fff8f3;padding: 20px;border-radius: 12px;border: 1px solid #ffe0d0\">\n        <h4 style=\"margin-top: 0;color: #d64d00;font-size: 18px\">High-Fidelity Capture<\/h4>\n        \n        <p style=\"margin-bottom: 0\">\n          For critical audio analysis requiring maximum detail, we employ the <strong>Schoeps Flexi Set CMC6 MK2<\/strong>. 
These microphones provide the transparency needed to capture delicate acoustic nuances.\n        <\/p>\n      <\/div>\n\n      <div style=\"flex: 1;min-width: 300px;background: #fff8f3;padding: 20px;border-radius: 12px;border: 1px solid #ffe0d0\">\n        <h4 style=\"margin-top: 0;color: #d64d00;font-size: 18px\">Ambient &amp; Versatile Capture<\/h4>\n        <p style=\"margin-bottom: 0\">\n          To cover a broader range of inputs and ensure versatility across the performance area, the system integrates <strong>Behringer B5<\/strong> microphones for distributed audio sensing.\n        <\/p>\n      <\/div>\n    <\/div>\n  <\/section>\n\n  <section style=\"margin-bottom: 60px\">\n    <h2 style=\"font-size: 28px;margin-bottom: 20px;border-bottom: 3px solid #28a745;padding-bottom: 5px\">\n      \ud83d\udc41\ufe0f Visual Input\n    <\/h2>\n    <p style=\"font-size: 16px;line-height: 1.7\">\n      Complementing the audio, the visual perception system utilizes <strong>Sony FDR-AX43A<\/strong> (Night Vision) and <strong>Panasonic HC-VXF11EG<\/strong> 4K cameras. These feed high-resolution video streams into our VLM pipeline for emotion recognition and scene analysis.\n    <\/p>\n  <\/section>\n\n  <section style=\"margin-bottom: 40px\">\n    <h2 style=\"font-size: 28px;margin-bottom: 20px;border-bottom: 3px solid #8e44ad;padding-bottom: 5px\">\n      \ud83e\udd16 Embodied AI &amp; Robotic Actuation\n    <\/h2>\n    <p style=\"font-size: 16px;line-height: 1.7;margin-bottom: 20px\">\n      Transitioning from perception to physical interaction, the system integrates state-of-the-art humanoid robotics managed by generalist foundation models. 
This layer allows the &#8220;Tutti-bot&#8221; to inhabit the physical space alongside performers.\n    <\/p>\n\n    <div style=\"display: flex;gap: 20px;flex-wrap: wrap\">\n      \n      <div style=\"flex: 1;min-width: 300px;background: #fdf2ff;padding: 25px;border-radius: 12px;border-left: 5px solid #8e44ad\">\n        <h3 style=\"margin-top: 0;font-size: 20px;color: #5b2c6f\">Unitree G1 Humanoid<\/h3>\n        \n        <p style=\"font-size: 15px;margin-bottom: 0;line-height: 1.6\">\n          The physical avatar of the system is the <strong>Unitree G1<\/strong>. This humanoid agent offers high-torque joint movement and agility, serving as the kinetic output for the generated behaviors. It translates the AI&#8217;s &#8220;emotions&#8221; into gestures and spatial navigation.\n        <\/p>\n      <\/div>\n\n      <div style=\"flex: 1;min-width: 300px;background: #f0fff4;padding: 25px;border-radius: 12px;border-left: 5px solid #2ecc71\">\n        <h3 style=\"margin-top: 0;font-size: 20px;color: #1e8449\">NVIDIA Project GR00T<\/h3>\n        \n        <p style=\"font-size: 15px;margin-bottom: 0;line-height: 1.6\">\n          Orchestrating the robot&#8217;s cognition is <strong>NVIDIA Project GR00T<\/strong>. This general-purpose foundation model enables multimodal understanding (language, video, and demonstration), allowing the robot to learn coordination and interact naturally rather than following hard-coded scripts.\n        <\/p>\n      <\/div>\n      \n    <\/div>\n  <\/section>\n\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>System Architecture Overview High-Performance Edge Computing, Multi-Sensory Acquisition &amp; Embodied AI The MUSIC4D technical architecture is designed to handle complex real-time processing of multimodal data. It combines a powerful Edge Node infrastructure for heavy computation, a high-fidelity sensory interface for environmental inputs, and state-of-the-art robotics for physical interaction. 
\u26a1 Computational Core: Edge Nodes At the &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/projects.dimes.unical.it\/music4d\/lattrezzatura\/\" class=\"more-link\">Read more<span class=\"screen-reader-text\"> &#8220;Equipment&#8221;<\/span><\/a><\/p>\n","protected":false},"author":7,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/pages\/67"}],"collection":[{"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/comments?post=67"}],"version-history":[{"count":24,"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/pages\/67\/revisions"}],"predecessor-version":[{"id":210,"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/pages\/67\/revisions\/210"}],"wp:attachment":[{"href":"https:\/\/projects.dimes.unical.it\/music4d\/wp-json\/wp\/v2\/media?parent=67"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}